(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization

International Bureau 

(43) International Publication Date: 7 August 2003 (07.08.2003)

(10) International Publication Number: WO 03/064599 A2



(51) International Patent Classification: C12N

(21) International Application Number: PCT/US03/01943

(22) International Filing Date: 24 January 2003 (24.01.2003)



(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:

10/054,935    25 January 2002 (25.01.2002)     US
60/356,130    14 February 2002 (14.02.2002)    US
10/102,946    22 March 2002 (22.03.2002)       US
10/117,229    8 April 2002 (08.04.2002)        US
10/144,198    14 May 2002 (14.05.2002)         US
10/197,824    19 July 2002 (19.07.2002)        US



(71) Applicant (for all designated States except US): ORIGENE
TECHNOLOGIES, INC. [US/US]; 6 Taft Court,
Suite 100, Rockville, MD 20850 (US).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): SUN, Zairen 
[CN/US]; 1083 Copperetone Court, Rockville, MD 20852 
(US). LI, Xuan [US/US]; 14808 Carona Drive, Silver 
Spring, MD 20905 (US). JAY, Gilbert [US/US]; 5801 
Nicholson Lane, North Bethesda, MD 20852 (US). KO- 
VACS, Karl, F. [US/US]; 5 Gruenther Court, Rockville, 
MD 20851 (US). FAN, Wufang [US/US]; 10790 Roselle 
Street, San Diego, CA 92121 (US). SHU, Youmin 
[US/US]; 2508 Chilham Place, Potomac, MD 20854 (US). 

(74) Agent: LEBOVITZ, Richard, M.; Origene Technologies, 
Inc, Suite 100, 6 Taft Court, Rockville, MD 20850 (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SK, SL,
TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, SE, SI,
SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
GQ, GW, ML, MR, NE, SN, TD, TG).

Declarations under Rule 4.17: 

— as to applicant 's entitlement to apply for and be granted 
a patent (Rule 4.17(ii)) for the following designations AE,
AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, 
CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, 
GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, 
KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, 
MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, 
SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW, 
ARIPO patent (GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, 
UG, ZM, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, 
RU, TJ, TM), European patent (AT, BE, BG, CH, CY, CZ, 
DE, DK, EE, ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, 
PT, SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, 
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG) 

— as to the applicant 's entitlement to claim the priority of the 
earlier application (Rule 4.17(iii)) for the following designations
AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY,
BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC,
EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, 
JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, 
MD, MG, MK, MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, 
SD, SE, SG, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, 
YU, ZA, ZW, ARIPO patent (GH, GM, KE, LS, MW, MZ, 
SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian patent (AM, AZ, 
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, 
BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HU,
IE, IT, LU, MC, NL, PT, SE, SI, SK, TR), OAPI patent (BF,
BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, 
TD, TG) 

— of inventorship (Rule 4.17(iv)) for US only

Published: 

— without international search report and to be republished 
upon receipt of that report 

For two-letter codes and other abbreviations, refer to the "Guid- 
ance Notes on Codes and Abbreviations" appearing at the begin- 
ning of each regular issue of the PCT Gazette.



(54) Title: CANCER GENES 

(57) Abstract: The present invention relates to all facets of novel polynucleotides, the polypeptides they encode, antibodies and
specific binding partners thereto, and their applications to research, diagnosis, drug discovery, therapy, clinical medicine, forensic
science and medicine, etc. The polynucleotides are differentially expressed in prostate and breast cancers and are therefore useful in
a variety of ways, including, but not limited to, as molecular markers, as drug targets, and for detecting, diagnosing, staging,
monitoring, prognosticating, preventing or treating, determining predisposition to, etc., cancers.

CANCER GENES

This application claims the benefit of U.S. Application Serial Nos. 10/054,935, filed
2002-01-25; 60/356,130, filed 2002-02-14; 10/102,946, filed 2002-03-22; 10/117,229, filed
2002-04-08; 10/144,198, filed 2002-05-14; and 10/197,824, filed 2002-07-19, which are hereby
incorporated by reference in their entirety.

DESCRIPTION OF THE DRAWINGS 
Figs. 1-18 show amino acid sequence alignments between polypeptides of the present
invention and polypeptides listed in public databases. SEQ ID NOS for the polypeptides of
the present invention are listed in Tables 3 and 4. Others are as follows: KIAA0803 (SEQ ID
NO 31); KIAA0408 (SEQ ID NO 32); NM_030817 (SEQ ID NO 33); NM_015384 (SEQ ID
NO 34); NM_133433 (SEQ ID NO 35); XM_033473 (SEQ ID NO 36); XM_059862 (SEQ
ID NO 37); NM_012062 (SEQ ID NO 38); NM_012063 (SEQ ID NO 39); NM_005690
(SEQ ID NO 40); XM_042775 (SEQ ID NO 41); NM_000125 (SEQ ID NO 42);
XM_094949 (SEQ ID NO 43); XM_050424 (SEQ ID NO 44); KIAA0534 (SEQ ID NO 76);
KIAA1217 (SEQ ID NO 77); KIAA0301 (SEQ ID NO 78); AF441770 (SEQ ID NO 79);
XM_085817 (SEQ ID NO 80); AK001276 (SEQ ID NO 81); XM_033473 (SEQ ID NO 82);
AK022207 (SEQ ID NO 83).

Fig. 19 shows differential display patterns for genes of the present invention. The
white arrowhead indicates the position of a DNA fragment derived from a differentially
regulated gene of the present invention. The experiments were performed in duplicate. Each
sample set (4 lanes) contains a duplicate from normal (left) prostate tissue and a duplicate
from tumor (right) prostate tissue from the same individual. There are several sample sets for
each gene.

Fig. 20 (A-G) shows the amino acid alignments of human kidins220 variants
(XM_045362, SEQ ID NO 90; and AB033076, SEQ ID NO 91) and rat variants (AF239045,
SEQ ID NO 94; and AF313464, SEQ ID NO 93). The referenced numbers are GenBank
identifiers.

Fig. 21 shows amino acid alignments between Urb-ctf ("BCU1041," SEQ ID NO 96),
AK014463 (mouse, SEQ ID NO 98) and XM_058887 (human, SEQ ID NO 97). Regions of
sequence identity are shaded.

Fig. 22 is the alignment of the amino acid sequences of human BCU399 (SEQ ID NO 
100), human XM_059670 (SEQ ID NO 101), a partial sequence for BCU399, and monkey 
AB071059 (SEQ ID NO 104). 


DESCRIPTION OF THE INVENTION 

The present invention relates to all facets of genes which are differentially regulated
in cancer, polypeptides encoded by them, antibodies and specific binding partners thereto,
and their applications to research, diagnosis, drug discovery, therapy, clinical medicine,
forensic science and medicine, etc. The polynucleotides and polypeptides are useful in a
variety of ways, including, but not limited to, as molecular markers, as drug targets, and for
detecting, diagnosing, staging, monitoring, prognosticating, preventing or treating,
determining predisposition to, etc., diseases and conditions of the breast and prostate,
especially cancer. The identification of specific genes, and groups of genes, expressed in
pathways physiologically relevant to prostate and breast permits the definition of functional
and disease pathways, and the delineation of targets in these pathways which are useful in
diagnostic, therapeutic, and clinical applications. The present invention also relates to
methods of using the polynucleotides and related products (proteins, antibodies, etc.) in
business and computer-related methods, e.g., advertising, displaying, offering, selling, etc.,
such products for sale, commercial use, licensing, etc.

No single gene or protein has been identified which is responsible for the etiology of
all prostate and breast cancers. For example, although prostate specific antigen ("PSA") is
widely used as a diagnostic reagent, it has limitations in its sensitivity and its ability to detect
early cancers. It is estimated that approximately 20% to 30% of tumors will be missed when
PSA is used alone. As a result, diagnostic and prognostic markers for cancer will involve the
identification and use of many different genes and gene products to reflect its multifactorial
origin. With this in mind, combinations of the differentially-expressed genes of the present
invention can be used as diagnostic and prognostic markers for prostate and breast cancers.

A continuing goal is to characterize the gene expression patterns of the various
cancers to genetically differentiate them, providing important guidance in preventing,
diagnosing, and treating cancers. Molecular pictures of cancer, such as the pattern of
differentially-regulated genes identified herein, provide an important tool for molecularly
dissecting and classifying cancer, identifying drug targets, providing prognosis and
therapeutic information, etc. For instance, an array of polynucleotides corresponding to
genes differentially regulated in prostate or breast cancer can be used to screen tissue samples
for the existence of cancer, to categorize the cancer (e.g., by the particular pattern observed),
to grade the cancer (e.g., by the number of up- or down-regulated genes and their amounts of
expression), to identify the source of a secondary tumor, to screen for metastatic cells, etc.
These arrays can be used in combination with other markers, e.g., PSA, PSMA (prostate-
specific membrane antigen), or any of the grading systems used in clinical medicine.

As indicated by these studies, cancer is a highly diverse disease. Although all cancers
share certain characteristics, the underlying cause and disease progression can differ
significantly from patient to patient. So far, over a dozen distinct genes have been identified
which, when mutant, result in a cancer. In breast cancer alone, a handful of different genes
have been isolated which either cause the cancer, or produce a predisposition to it. As a
consequence, disease phenotypes for a particular cancer do not all look the same. In addition
to the differences in the gene(s) responsible for the cancer, heterogeneity among individuals,
e.g., in age, health, sex, and genetic background, can also influence the disease and its
progression. Gene penetrance, in particular, can vary widely among population members.
Recent studies have shown tremendous diversity in gene expression patterns among cancer
patients. For these and other reasons, one gene/polypeptide target alone can be insufficient to
diagnose or treat a cancer. Even a gene which is highly differentially-expressed and
penetrant in cancer patients may not be so highly expressed in all patients and at all stages of
the cancer. By selecting a set of genes and/or the polypeptides they encode, cancer
diagnostics and therapeutics can be designed which effectively diagnose and treat a
population of diseased individuals, rather than only a small handful when single genes are
targeted.
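
The benefit of a gene set over a single gene can be illustrated with a small sketch. The patient labels, gene names, and the `panel_coverage` helper below are hypothetical and are not taken from the data disclosed herein; the sketch only assumes that, for each patient, the set of genes found to be differentially expressed is known.

```python
def panel_coverage(calls, panel):
    """Fraction of patients in whom at least one panel gene is differentially expressed.

    calls: dict mapping a patient label to the set of genes called
           differentially expressed in that patient (hypothetical input format).
    panel: iterable of gene names making up the marker set.
    """
    if not calls:
        raise ValueError("at least one patient is required")
    panel = set(panel)
    detected = sum(1 for genes in calls.values() if panel & set(genes))
    return detected / len(calls)


# Hypothetical calls, for illustration only.
calls = {
    "patient_1": {"gene_A"},
    "patient_2": {"gene_B"},
    "patient_3": {"gene_A", "gene_C"},
    "patient_4": {"gene_C"},
}

single = panel_coverage(calls, ["gene_A"])
combined = panel_coverage(calls, ["gene_A", "gene_B", "gene_C"])
print(single, combined)
```

In this made-up example an incompletely penetrant single gene flags half of the patients, while the three-gene set flags all of them.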

In accordance with the present invention, genes have been identified which are
differentially expressed in prostate cancer. See, e.g., Tables 1-5 and below. These genes can
be further divided into groups based on additional characteristics of their expression and the
tissues in which they are expressed. For instance, genes can be further subdivided based on
the stage and/or grade of the cancer in which they are expressed. Genes can also be grouped
based on their penetrance in a cancer, e.g., expressed in all cancers examined, expressed in a
certain percentage of cancers examined, etc. Additionally, genes can be categorized by their
function and/or the polypeptides they encode. This includes, but is not limited to, cellular
localization, functional activity (e.g., kinase, cytoskeletal element, or transcriptional factor),
functional pathway (e.g., protein manufacture, cell signaling, cell movement, cell adhesion,
responsivity to cAMP, energy production, etc.), etc. These groupings do not restrict or limit
the use of such genes in therapeutic, diagnostic, prognostic, etc., applications. For instance, a
gene which is expressed in only some cancers (e.g., incompletely penetrant) may be useful in
therapeutic applications to treat a subset of cancers. Similarly, a co-penetrant gene, or a gene
which is expressed in prostate cancer and other normal tissues, may be useful as a therapeutic
or diagnostic, even if its expression pattern is not highly prostate specific. Thus, the uses of
the genes or their products are not limited by their patterns of expression.

In developing reagents for the diagnosis and treatment of a disease, it may be useful
to know the cellular localization of a differentially expressed polypeptide to determine how
to use it as a target. Proteins which are secreted or on the cell-surface are more readily
accessible than intracellular proteins, and can be, e.g., blocked or inhibited to restore levels to
normal.

In recent years, there have been numerous reports on specific targeting of tumor cells
with monoclonal antibody-drug conjugates using cell-surface proteins. See, e.g., Chari, Adv.
Drug Deliv. Rev., 31: 89-104 (1998); Pietersz and Krauer, J. Drug Targeting, 2: 183-215
(1994); Sela et al., in Immunoconjugates, 189-216 (C. Vogel, ed. 1987); Ghose et al., in
Targeted Drugs, 1-22 (E. Goldberg, ed. 1983); Diener et al., in Antibody mediated delivery
systems, 1-23 (J. Rodwell, ed. 1988); Pietersz et al., in Antibody mediated delivery systems,
25-53 (J. Rodwell, ed. 1988); Bumol et al., in Antibody mediated delivery systems, 55-79 (J.
Rodwell, ed. 1988). Cytotoxic drugs such as methotrexate, daunorubicin, doxorubicin,
vincristine, vinblastine, melphalan, mitomycin C, and chlorambucil have been conjugated to
a variety of monoclonal antibodies. Therapeutic agents can be directly conjugated to the
antibody, or through cleavable linkers which facilitate the release of the agent in active form
only when it is inside the cell. See, e.g., U.S. Pat. No. 6,333,410.

By the phrase "differential expression," it is meant that the levels of expression of a
gene, as measured by its transcription or translation product, are different depending upon the
specific cell-type or tissue (e.g., in an averaging assay that looks at a population of cells).
There are no absolute amounts by which the gene expression levels must vary, as long as the
differences are measurable.

The phrase "up-regulated" indicates that an mRNA transcript or other nucleic acid
corresponding to a polynucleotide of the present invention is expressed in larger amounts in a
cancer as compared to the same transcript expressed in normal cells from which the cancer
was derived. The phrase "down-regulated" indicates that an mRNA transcript or other
nucleic acid corresponding to a polynucleotide of the present invention is expressed in lower
amounts in a cancer as compared to the same transcript expressed in normal cells from which
the cancer was derived. In general, differential-regulation can be assessed by any suitable
method, including any of the nucleic acid detection and hybridization methods mentioned
below, as well as polypeptide-based methods. Up-regulation also includes going from
substantially no expression in a normal tissue, from detectable expression in a normal tissue,
or from significant expression in a normal tissue, to higher levels in the cancer. Down-
regulation also includes going from detectable expression in a normal tissue, or from
significant expression in a normal tissue, to lower levels in the cancer.

Differential regulation can be determined by any suitable method, e.g., by comparing
the abundance of a transcript per gram of RNA (e.g., total RNA, polyadenylated mRNA, etc.)
extracted from a prostate tissue with its abundance in the corresponding normal tissue. The
normal tissue can be from the same or a different individual or source. For convenience, it can
be supplied as a separate component or in a kit in combination with probes and other reagents
for detecting genes. The quantity by which a nucleic acid is differentially-regulated can be any
value, e.g., about 10% more or less of normal expression, about 50% more or less of normal
expression, 2-fold more or less, 5-fold more or less, 10-fold more or less, etc.
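
The percent and fold differences mentioned above can be computed as in the following minimal sketch. The function names and the example abundance values are illustrative assumptions, not part of the disclosure; abundances are taken to be in any consistent unit, e.g., transcript amount per gram of RNA.

```python
def fold_change(cancer_level, normal_level):
    """Ratio of transcript abundance in cancer tissue to that in normal tissue."""
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("abundances must be positive to form a ratio")
    return cancer_level / normal_level


def percent_change(cancer_level, normal_level):
    """Difference from normal expression, as a percentage of normal expression."""
    if normal_level <= 0:
        raise ValueError("normal abundance must be positive")
    return 100.0 * (cancer_level - normal_level) / normal_level


def describe_regulation(cancer_level, normal_level):
    """Return a (direction, fold) pair, with fold always expressed as >= 1."""
    ratio = fold_change(cancer_level, normal_level)
    if ratio > 1.0:
        return ("up-regulated", ratio)
    if ratio < 1.0:
        return ("down-regulated", 1.0 / ratio)
    return ("unchanged", 1.0)


# Hypothetical abundances, for illustration only.
print(describe_regulation(20.0, 10.0))  # 2-fold more than normal
print(describe_regulation(5.0, 10.0))   # 2-fold less than normal
print(percent_change(15.0, 10.0))       # 50% more than normal
```

No minimum threshold is built in, consistent with the statement that any measurable difference qualifies.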

The amount of transcript can also be compared to a different gene in the same sample,
especially a gene whose abundance is known and substantially no different in its expression
between normal and cancer cells (e.g., a "control" gene). If represented as a ratio, with the
quantity of differentially-regulated gene transcript in the numerator and the control gene
transcript in the denominator, the ratio would be larger, e.g., in prostate cancer than in a
sample from normal prostate tissue.
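
The control-gene ratio described above can be sketched as follows. The function names and the numerical values are hypothetical; the only assumption is that target and control transcript quantities are measured in the same sample and in the same units.

```python
def normalized_expression(target_level, control_level):
    """Target transcript quantity divided by control-gene transcript quantity."""
    if control_level <= 0:
        raise ValueError("control gene quantity must be positive")
    return target_level / control_level


def relative_to_normal(target_cancer, control_cancer, target_normal, control_normal):
    """Normalized ratio in the cancer sample relative to the normal sample."""
    cancer_ratio = normalized_expression(target_cancer, control_cancer)
    normal_ratio = normalized_expression(target_normal, control_normal)
    if normal_ratio <= 0:
        raise ValueError("normal-sample ratio must be positive")
    return cancer_ratio / normal_ratio


# Hypothetical quantities: the control gene differs between samples only
# because different amounts of RNA were loaded.
cancer_ratio = normalized_expression(80.0, 20.0)   # 4.0
normal_ratio = normalized_expression(10.0, 10.0)   # 1.0
print(cancer_ratio > normal_ratio)                 # larger ratio in the cancer sample
print(relative_to_normal(80.0, 20.0, 10.0, 10.0))
```

Dividing by the control gene removes differences in sample loading, so the remaining difference between the two ratios reflects regulation of the target transcript.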

Differential-regulation can arise through a number of different mechanisms. The
present invention is not bound by any specific way through which it occurs. Differential-
regulation of a polynucleotide can occur, e.g., by modulating (1) transcriptional rate of the
gene (e.g., increasing its rate, inducing or stimulating its transcription from a basal, low-level
rate, etc.), (2) the post-transcriptional processing of RNA transcripts, (3) the transport of
RNA from the nucleus into the cytoplasm, (4) RNA nuclear and cytoplasmic turnover, and
polypeptide turnover (e.g., by virtue of having higher stability or resistance to degradation),
and combinations thereof. See, e.g., Tollervey and Caceres, Cell, 103:703-709, 2000.

A differentially-regulated polynucleotide is useful in a variety of different
applications as described in greater detail below. Because it is more abundant in cancer, it
and its expression products can be used in a diagnostic test to assay for the presence of
cancer, e.g., in tissue sections, in a biopsy sample, in total RNA, in lymph, in blood, etc.
Differentially-regulated polynucleotides and polypeptides can be used individually, or in
groups, to assess the cancer, e.g., to determine the specific type of cancer, its stage of
development, the nature of the genetic defect, etc., or to assess the efficacy of a treatment
modality. How to use polynucleotides in diagnostic and prognostic assays is discussed
below. In addition, the polynucleotides and the polypeptides they encode can serve as a
target for therapy or drug discovery. A polypeptide, coded for by a differentially-regulated
polynucleotide, which is displayed on the cell-surface, can be a target for immunotherapy to
destroy, inhibit, etc., the diseased tissue. Differentially-regulated transcripts can also be used
in drug discovery schemes to identify pharmacological agents which modulate, suppress,
inhibit, activate, increase, etc., their differential-regulation, thereby preventing the phenotype
associated with their expression. Thus, a differentially-regulated polynucleotide and its
expression products of the present invention have significant applications in diagnostic,
therapeutic, prognostic, drug development, and related areas.

The expression patterns of the selectively expressed genes disclosed herein can be
described as a "fingerprint" in that they are a distinctive pattern displayed by a tissue. Just as
with a fingerprint, an expression pattern can be used as a unique identifier to characterize the
status of a tissue sample. The list of expressed sequences disclosed herein provides an
example of such a tissue expression profile. It can be used as a point of reference to compare
and characterize samples. Tissue fingerprints can be used in many ways, e.g., to classify a
tissue as prostate cancer, to determine the origin of metastatic cells, to assess the
physiological status of a tissue, to determine the effect of a particular treatment regime on a
tissue, to evaluate the toxicity of a compound on a tissue of interest, to determine the
presence of a cancer in a biopsy sample, to assess the efficacy of a cancer therapy in a human
patient or a non-human animal model, to detect circulating cancer cells in blood or a lymph
node biopsy, etc. While the expression profile of the complete gene set represented in the
present invention may be most informative, a fingerprint containing expression information
from less than the full collection can be useful, as well. In the same way that an incomplete
fingerprint may contain enough of the pattern of whorls, arches, loops, and ridges to identify
the individual, a cell expression fingerprint containing less than the full complement may be
adequate to provide useful and unique identifying and other information about the sample.
Moreover, cancer is a multifactorial disease, involving genetic aberrations in more than one
gene locus. This multifaceted nature may be reflected in different cell expression profiles
associated with prostate cancers arising in different individuals, in different locations in the
same individual, or even within the same cancer locus. As a result, a complete match with a
particular cell expression profile, as shown herein, is not necessary to classify a cancer as
being of the same type or stage. Similarity to one cell expression profile, e.g., as compared to
another, can be adequate to classify cancer types, grades, and stages.
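
One way such a similarity comparison could be carried out is sketched below. The use of Pearson correlation, the gene names, and the expression values are all illustrative assumptions; the present disclosure does not prescribe a particular similarity measure. The sketch compares a sample only over the genes it shares with each reference profile, so an incomplete fingerprint can still be classified.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm_x = math.sqrt(sum(a * a for a in dx))
    norm_y = math.sqrt(sum(b * b for b in dy))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sum(a * b for a, b in zip(dx, dy)) / (norm_x * norm_y)


def classify(sample, references):
    """Return (label, correlation) of the reference profile most similar to the sample.

    sample: dict gene -> expression level (may cover only some genes).
    references: dict label -> dict gene -> expression level.
    """
    if not references:
        raise ValueError("at least one reference profile is required")
    best_label, best_r = None, -2.0
    for label, profile in references.items():
        shared = sorted(set(sample) & set(profile))
        r = pearson([sample[g] for g in shared], [profile[g] for g in shared])
        if r > best_r:
            best_label, best_r = label, r
    return best_label, best_r


# Hypothetical reference profiles and a partial sample fingerprint.
references = {
    "normal": {"g1": 1.0, "g2": 1.0, "g3": 5.0, "g4": 5.0},
    "cancer": {"g1": 5.0, "g2": 6.0, "g3": 1.0, "g4": 1.0},
}
sample = {"g1": 4.0, "g2": 5.5, "g3": 1.2}  # g4 was not measured

label, r = classify(sample, references)
print(label, round(r, 3))
```

The sample is assigned to whichever reference it resembles more closely, without requiring a complete match to either profile.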



For example, the tissue-selective genes disclosed herein represent the configuration of
genes expressed by a cancer tissue. To determine the effect of a toxin on a tissue, a sample
of tissue is obtained prior to toxin exposure ("control") and then at one or more time points
after toxin exposure ("experimental"). An array of tissue-selective probes can be used to
assess the expression patterns for both the control and experimental samples. Methods of
making and using arrays are described below.

Urb-ctf (BCU1041FB)

Urb-ctf ("Up-Regulated Breast Cancer Transcription Factor" or BCU1041FB or
FB2847A11) codes for a transcription regulatory factor having 614 amino acids which is up-
regulated in breast cancer. The nucleotide and amino acid sequences of Urb-ctf are shown in
SEQ ID NOS 95 and 96. It contains a bZIP domain at about amino acid positions 228-275,
conferring DNA-binding activity. It also has a leucine zipper providing a dimerization
activity. There are a number of UniGene clusters that map close to the gene, including, e.g.,
Hs.350229, Hs.272458, Hs.350229, Hs.255286, Hs.184779, and Hs.276916. Predictions
using GenomeScan (e.g., Yeh et al., Genome Res. 11: 803-816, 2001) revealed at least two
different predicted genes, Hs17_11001_27_4_1 and Hs17_11001_27_5_2, instead of the
single gene, Urb-ctf, described herein. A partial human cDNA (AL049450; XM_058887;
SEQ ID NO 97) for Urb-ctf was previously identified, but this coded for only 198 amino
acids and contained only a part of the bZIP domain, as well as missing significant portions of
the N- and C-termini. A mouse homolog, AK014463 (SEQ ID NO 98), has been cloned.

All or part of Urb-ctf is located in genomic DNA represented by GenBank ID:
AC068669, BAC-ID: RP11-749116, and Contig ID: NT_010844. The present invention
relates to any isolated introns and exons that are present in the gene. Intron and exon
boundaries can be routinely determined, e.g., using the polypeptide and genomic sequences
disclosed herein. Using UniSTS probes, Urb-ctf can be chromosomally mapped at its 5' end
with UniSTS: 155813 to 40.144Mb, and its 3' end with UniSTS: 619 to 40.084Mb.
Strikingly, Urb-ctf overlaps with the thyroid hormone receptor alpha 2 gene (CAB57886).

As indicated by the presence of a bZIP domain, Urb-ctf has transcriptional regulatory
activity, DNA-binding activity, and dimerization activity. These activities can be determined
routinely. For example, DNA-binding activity can be determined using gel-shift assays, e.g.,
as carried out in U.S. Pat. Nos. 6,333,407 and 5,789,538. Transcriptional activity can be
determined using conventional transcriptional assays, including in vivo and in vitro assays,
such as those described in F.M. Ausubel et al., Eds., CURRENT PROTOCOLS IN
MOLECULAR BIOLOGY (John Wiley & Sons, New York, 1994); de Wet et al., Mol. Cell
Biol. 7:725 (1987); U.S. Pat. No. 6,306,649; U.S. Pat. No. 6,214,588; Liao, S. M. et al., Genes
Dev. 5:2431-2440 (1991); Nonet, M., et al., Cell 50:909-915 (1987). The phrase
"transcriptional regulatory activity" indicates that the polypeptide modulates transcription in
analogy to the activity of other bZIP proteins, e.g., by binding to DNA and interacting with
other proteins of the transcription apparatus. For example, both c-Jun and c-Fos are bZIP
proteins that form a dimer known as the transcriptional activator AP-1.
See, e.g., Genes VII, Lewin, Pages 649-665, 2000. Dimerization activity, i.e., the
ability to form hetero- or homodimers with other proteins (in analogy to the c-fos and c-jun
system), can be measured routinely, e.g., using the yeast two-hybrid system.

Nucleic acids of the present invention map to chromosomal band 17q21.1. There are a
number of different disorders which have been mapped to, or in close proximity to, this
chromosome location. These include, e.g., Dementia, frontotemporal, with parkinsonism;
Neuroblastoma; Osteoporosis, idiopathic; Ehlers-Danlos syndrome, types I and VIIA;
Osteogenesis imperfecta; Glanzmann thrombasthenia, type B; Renal cell carcinoma,
papillary; Thrombocytopenia, neonatal alloimmune; Trichodontoosseous syndrome;
Hypertension; Epidermolytic hyperkeratosis; Hemolytic anemia due to band 3 defect;
Spherocytosis, hereditary; Gliosis, familial progressive subcortical; Renal tubular acidosis,
distal; Patella aplasia or hypoplasia; and Pseudohypoaldosteronism type II. Nucleic acids of
the present invention can be used as linkage markers, diagnostic targets, and therapeutic
targets for any of the mentioned disorders, as well as any disorders or genes mapping in
proximity to it.

In addition to its expression in breast cancer, Urb-ctf can be detected in most tissues
examined, but is detected either not at all, or at very low levels, in normal breast tissue.
Multiple forms of it can be detected in the brain, muscle, testes, and thymus. As these results
indicate, Urb-ctf has a normal functional role in most tissues, and can consequently be
involved with diseases associated with them, as well. For instance, Urb-ctf can be involved
in renal cell carcinoma and familial gliosis disease. As discussed earlier, no single gene is
responsible for all breast cancers. Thus, the fact that Urb-ctf is up-regulated in the breast
cancers examined herein does not necessarily mean that it will be up-regulated in all human
breast cancers.



BCU399 

Human BCU399 codes for a polypeptide of 487 amino acids, which is upregulated in
breast cancer, in both early stage ductal carcinoma and in late stage invasive carcinoma.
The nucleotide and amino acid sequences of human BCU399 are shown in SEQ ID NOS 99
and 100. It contains seven transmembrane domains at about amino acids 106-128, 135-157,
172-194, 231-253, 268-285, 367-389 and 458-480 of SEQ ID NO 100, a signature of the G
protein-coupled receptor family. It contains a nucleotide-binding site motif (P-loop) at
about amino acids 53-60, indicating that it is a purinergic receptor liganded by nucleotides,
including ATP, ADP, GTP, GDP, UTP, and/or UDP. It contains a G-protein binding motif
at about amino acids 63-75. It contains N-glycosylation motifs at about amino acids 34-37,





135-138, 203-206 and 397-400. It contains phosphorylation motifs, important for regulatory
functions, including PKC phosphorylation motifs at about amino acids 36-38, 227-229, 262-
264, 313-315, and 445-447 of SEQ ID NO 100; CK2 phosphorylation motifs at about amino
acids 44-47, 60-63, 89-92, 91-94, and 356-359; and a tyrosine kinase phosphorylation motif
at about amino acids 59-66. The human BCU399 contains 12 exons. The present invention
relates to any isolated introns and exons that are present in the gene. Intron and exon
boundaries can be routinely determined, e.g., using the sequences disclosed herein.

A partial sequence for human BCU399 was previously identified (Accession Number
XM_059670), but this sequence (SEQ ID NO 101) was incomplete, coding for only 153 amino
acids (See Fig. 22, "Human"). Its homolog was identified in monkey (Accession Number
AB071059), but this sequence was also incomplete, coding for only 360 amino acids (See
Fig. 22, "Monkey"). Monkey BCU399 (SEQ ID NO 102) lacks the first 127 amino acids of
human BCU399 (SEQ ID NO 100) but shares about 99% amino acid sequence identity to
human BCU399 along its entire length of 360 amino acids, with three amino acids different
from human BCU399 at about positions 187, 238, and 412 (See Fig. 22). Related genes
have been identified in human, mouse, and Drosophila. BCU399 shares 48% identity with
the related full-length human sequence XM_009330; 46% identity with its mouse homolog
BC021367; and 41% identity with its fly homolog AE003546. The functions of these
homologs are unknown, and all three lack the nucleotide-binding site of BCU399, indicating
that they are functionally different from BCU399.
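
The identity figures above can be related to aligned sequences with a short calculation. The `percent_identity` helper and the toy sequences below are illustrative assumptions; in particular, the denominator convention used here (aligned columns without gaps) is only one of several in common use, and the source does not state which convention produced its figures.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, gap-free columns.

    Both inputs must be rows of the same pairwise alignment and therefore
    have equal length; columns containing a gap in either row are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy alignment: 6 gap-free columns, 5 identical.
print(round(percent_identity("MKTA-LV", "MKSAGLV"), 1))

# Three differences over 360 aligned residues, as for the monkey and human
# sequences described above: 357/360 is about 99.2%.
human_like = "A" * 360
monkey_like = list(human_like)
for position in (10, 100, 300):   # arbitrary placeholder positions
    monkey_like[position] = "G"
print(round(percent_identity(human_like, "".join(monkey_like)), 1))
```

The second calculation is consistent with the "about 99%" identity stated for the 360-residue overlap.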

Because of the upregulation of BCU399 in breast cancer tissue, its polynucleotides,
polypeptides, and peptides can be used as diagnostic, therapeutic, and research tools in
breast cancer. Upregulation can be routinely assessed by, e.g., RT-PCR. Antibodies and
other BCU399 ligands can be used to selectively target agents to breast tissue for purposes
including, but not limited to, imaging, diagnostics, therapeutics, etc. In addition to its
association with breast cancer, BCU399 is also expressed in lymphocytes and in adrenal,
brain, kidney, lung, lymph node, breast, muscle, ovary, prostate, stomach, testis, thymus and
thyroid tissue.

Imaging of tissues can be facilitated using agents such as BCU399 ligands that can
be used to target contrast agents to a specific site in the body. Various imaging techniques
have been used in this context, including, e.g., X-ray, CT, CAT, MRI, ultrasound, PET,
SPECT, and scintigraphy. A reporter agent can be conjugated or associated routinely with
a BCU399 ligand. Ultrasound contrast agents combined with ligands such as antibodies are
described in, e.g., U.S. Pat. Nos. 6,264,917; 6,254,852; 6,245,318; and 6,139,819. MRI
contrast agents, such as metal chelators, radionucleotides, paramagnetic ions, etc., combined
with selective targeting agents are also described in the literature, e.g., in U.S. Pat. Nos.
6,280,706 and 6,221,334. The methods described therein can be used generally to associate
BCU399 and ligands thereof with an agent for any desired purpose.

An active agent can be associated in any manner with a BCU399 ligand that is 
effective to achieve its delivery to the target. The association of the active agent and the
ligand ("coupling") can be direct, e.g., through chemical bonds between the binding ligand
and the agent or via a linking agent, or the association can be less direct, e.g., where the
active agent is in a liposome, or other carrier, and the ligand is associated with the liposome
surface. In such case, the ligand can be oriented in such a way that it is able to bind to
BCU399 on the surfaces of breast tissue cells. 

BCU399 maps to the chromosomal region 5q14.3. Consistent with its neuronal
expression, a susceptibility to febrile seizures (Febrile Convulsions, Familial, 4) was
mapped to this same chromosomal locus. Nakayama et al., Human Molecular Genetics,
9:87-91, 2000. In addition, several other diseases mapped to this location, including, e.g.,
Wagner syndrome and Usher syndrome, both disorders involved in eye disease. Black et al.,
Ophthalmology, 106:2074-2081, 1999. Pieke-Dahl et al., Journal of Medical Genetics,
37:256-262, 2000. Nucleic acids of the present invention can be used, e.g., as linkage
markers, diagnostic targets, and therapeutic targets for any of the mentioned disorders, as
well as any disorders or genes mapping in proximity of BCU399.

Its nucleotide binding properties make BCU399 polypeptides useful for assaying
nucleotides such as ATP, GTP, etc. Various assay methods can be used, including filtration
assays, column chromatography, etc., where labeled BCU399 polypeptides and/or
nucleotides are used. BCU399 polypeptides or portions thereof including, e.g., the
nucleotide binding motif and other motifs important in nucleotide binding can be used as a
capture moiety. Various detection methods can be used. For example, nucleotide binding
and relative concentration can be measured spectroscopically (e.g., EPR spectroscopy).
BCU399 polypeptides or portions thereof can also be incorporated into column
chromatography resins. After binding to the column resin, nucleotides can be chemically
released and measured by commercially available bioluminescence assays (e.g.,
BioWhittaker ViaLight HS kit). Competitive binding assays can also be utilized, where
concentration in an unknown sample is determined by its ability to compete with labeled
nucleotide.
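
The arithmetic behind such a competitive binding assay can be sketched as follows. This is only an illustration under stated assumptions that do not come from the text: labeled and unlabeled nucleotide are assumed to bind with equal affinity, and the binding sites are assumed to be limiting and saturated, so that the fraction of labeled signal retained equals labeled / (labeled + unlabeled). The function name and the numbers are hypothetical.

```python
def estimate_unlabeled(labeled_conc: float, bound_fraction: float) -> float:
    """Estimate the unlabeled (sample) nucleotide concentration.

    bound_fraction is the labeled signal measured with the sample present,
    divided by the signal with labeled nucleotide alone.

    Assumption (not from the source): equal affinity and saturated, limiting
    binding sites, so bound_fraction = labeled / (labeled + unlabeled).
    """
    if not 0.0 < bound_fraction <= 1.0:
        raise ValueError("bound_fraction must be in (0, 1]")
    return labeled_conc * (1.0 / bound_fraction - 1.0)


# Hypothetical reading: signal drops to half with 10 units of labeled nucleotide.
print(estimate_unlabeled(10.0, 0.5))  # 10.0
```

In practice a standard curve measured with known competitor concentrations would replace this idealized relation.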

Useful human BCU399 polypeptides and corresponding nucleic acids include
polypeptides comprising amino acids 1-127, 150-487, 170-200, 230-250, 400-420 and
fragments thereof (See SEQ ID NO 100 and Fig. 22). Useful human BCU399 polypeptides
and corresponding nucleic acids also include the nucleotide binding motif at about amino
acids 53-60; extra-membrane loop sequences at about amino acids 1-105, 129-134, 158-171,
195-230, 254-267, 286-366, 390-457, and 481-487, and the motif for binding proteins,
including G-proteins, at about amino acids 63-75 (See SEQ ID NO 100 and Fig. 22). The
nucleic acids that code for BCU399 can be used for the generation of, e.g., nucleic acid
probes, mutant sequences, including, e.g., chimeric sequences and antisense sequences, by
PCR. BCU399 polypeptides and corresponding nucleic acids can be used, e.g., to generate
antibodies for distinguishing between the human and monkey forms of BCU399. Its
polypeptides and corresponding nucleic acids can be used to generate antibodies to the
receptor surface to be used, e.g., as blocking agents in signal transduction pathways. The
polypeptides or portions of them may be incorporated into resins for purification of ligands,
e.g., G proteins, nucleotides, naturally-occurring ligands. The polypeptides can be used as
competitors for ligand binding, e.g., ATP and G proteins, in ligand-binding assays.
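
Residue ranges such as those listed above are 1-based and inclusive. The sketch below shows how such a range could be cut from a sequence string and mapped to the codons that encode it. The sequence used is a made-up placeholder, not SEQ ID NO 100, and the helper names are hypothetical; the mapping assumes the polypeptide is numbered from the first codon of an uninterrupted open reading frame.

```python
from typing import Tuple


def fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


def coding_span(start: int, end: int) -> Tuple[int, int]:
    """Nucleotide positions (1-based, inclusive) in the ORF encoding residues start..end."""
    return 3 * (start - 1) + 1, 3 * end


# Placeholder 500-residue sequence for illustration only.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 25

motif = fragment(placeholder, 53, 60)   # range given for the nucleotide binding motif
print(len(motif))                       # 8
print(coding_span(53, 60))              # (157, 180)
```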

BCU399 has several activities, including, e.g., nucleotide binding, ligand binding,
signal transduction, phosphorylation, conformational change, etc. By "nucleotide binding"
and "ligand binding" is meant the covalent or non-covalent association of a nucleotide,
protein, or other molecule with one or more amino acids of BCU399, for example, as
described in Merighi et al., British Journal of Pharmacology, 134:1215-26, 2001. By
"signal transduction" is meant the activation of a chain of events that alters the
concentration of one or more small intracellular signaling molecules (second messengers),
e.g., cyclic AMP, calcium ions, as described in Sabala et al., British Journal of
Pharmacology, 132:393-402, 2001. By "phosphorylation" is meant the covalent
attachment to an amino acid, e.g., serine, threonine, tyrosine, etc., of a phosphate group
from a nucleotide, e.g., ATP, GTP, etc., by means of a kinase, e.g., PK2, PKC, tyrosine
kinase, etc. Hausdorff et al., FASEB Journal, 4:2881-2889, 1990. By "conformational
change" is meant a change in the tertiary or quaternary structure of a polypeptide.
Ballesteros et al., Molecular Pharmacology, 60:1-19, 2001. These activities can be
determined routinely. For instance, the binding affinity of nucleotides and other ligands
can be measured with ligands fused to radioactive or fluorescent markers (e.g., γ-32P-ATP
or green fluorescent protein) and visualized by phosphorimager analysis or fluorimetry.
Signal transduction can be assessed by expression of BCU399 in cells, stimulation by
appropriate ligands, e.g., nucleotides such as ATP, GTP, etc., or their analogs, and
measurement of the concentrations of elicited second messengers or byproducts, e.g., Ca2+
or cAMP, by, e.g., atomic absorption spectrometry (ThermoElemental SOLAAR AA
spectrometers), radioimmunoassay, etc. Phosphorylation can be assessed by, e.g.,
phosphorylation assay systems (Perkin Elmer FlashPlate Plus). Conformational change
can be assessed spectroscopically (circular dichroism, NMR spectroscopy) or using
antibodies to specific conformations.

Human Kidins (PC473) 

Human Kidins220Pc (kinase D-interacting substrate of 220 kDa) codes for a
polypeptide containing 1715 amino acids. The nucleotide and amino acid sequences of
Kidins220 are shown in SEQ ID NOS 88 and 89. It contains 11 ANK domains at about
amino acid positions 37-66, 70-99, 103-132, 137-166, 170-199, 203-232, 236-265, 269-298,
302-331, 335-364, and 368-399. Four transmembrane domains are located at about amino
acid positions 496-518, 525-547, 659-681, and 688-707. There is a SAM domain at about
amino acids 1151-1223. It contains cAMP and cGMP protein kinase phosphorylation site
motifs at about 880-883, 901-904, 1250-1253, 1438-1441, and 1524-1527; protein kinase C
phosphorylation site motifs at about 167-169, 219-221, 233-235, 381-383, 471-473, 562-564,
590-592, 722-724, 791-793, 904-906, 939-941, 950-952, 998-1000, 1012-1014, 1034-1036,
1180-1182, 1298-1300, 1320-1322, 1351-1353, 1441-1443, 1567-1569, 1677-1679, and
1681-1683; ATP/GTP-binding site motif A (P-loop) at about amino acid positions 467-474;
and tyrosine phosphorylation site motifs at 403-409 and 1397-1404. Its N- and C-termini
are cytoplasmic. A UniGene cluster is represented by Hs.9873.
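
For working with these annotations, the coordinates can be held in a simple lookup structure. The sketch below copies a subset of the ranges given above (ANK, transmembrane, SAM, and P-loop); the dictionary layout and function names are illustrative assumptions, not part of the disclosure.

```python
PROTEIN_LENGTH = 1715  # amino acids, as stated in the text

# Subset of the features listed above, as 1-based inclusive (start, end) pairs.
FEATURES = {
    "ANK": [(37, 66), (70, 99), (103, 132), (137, 166), (170, 199), (203, 232),
            (236, 265), (269, 298), (302, 331), (335, 364), (368, 399)],
    "transmembrane": [(496, 518), (525, 547), (659, 681), (688, 707)],
    "SAM": [(1151, 1223)],
    "P-loop": [(467, 474)],
}


def coordinates_valid() -> bool:
    """Check that every range is ordered and lies within the polypeptide."""
    return all(1 <= start <= end <= PROTEIN_LENGTH
               for spans in FEATURES.values() for start, end in spans)


def features_at(position: int) -> list:
    """Names of the listed features that cover a given residue position."""
    return sorted(name for name, spans in FEATURES.items()
                  if any(start <= position <= end for start, end in spans))


print(len(FEATURES["ANK"]))   # 11
print(features_at(500))       # ['transmembrane']
```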

There are several alternative forms of Kidins220Pc (e.g., different sequences as a
result of alternative splicing, etc.). AB033076 (Fig. 20; SEQ ID NO 91) appears to be a
complete cDNA having an insertion of about 57 amino acids after human Kidins220Pc
residue 1138, as well as containing an additional amino acid residue, Q, at about amino acid
position 136. See, Fig. 20. AB033076 also has a six-amino acid extension at its N-terminus,
LQLSVK (SEQ ID NO 92), which is not shown. XM_045362 (Fig. 20; SEQ ID NO 90) is
a partial and incomplete EST for human Kidins220Pc, missing from about amino acid 1138.
See, Fig. 20. It contains the above-mentioned insertion, making it closer to the AB033076
variant. In addition to the Q residue at position 136, the following sequences (polypeptide
and corresponding nucleotide) can be used to distinguish the different forms: 1138-1184
(SEQ ID NO 90), 1138-1176 (SEQ ID NO 90), 1177-1184 (SEQ ID NO 90), 1138-1194
(SEQ ID NO 91), or 1177-1194 (SEQ ID NO 91).

There are several rat homologs of human Kidins220. AF313464 (Fig. 20; SEQ ID
NO 93) shares about 92% amino acid sequence identity and 95% amino acid homology along
its entire length. Like the human Kidins220Pc form, this rat homolog does not contain the
amino acid insertion present in AB033076, but it does contain the Q residue at 136.
AF239045 (Fig. 20; SEQ ID NO 94) is another rat homolog, closer to the AB033076 form,
having about 91% amino acid sequence identity and 93% amino acid homology along its
entire length to human Kidins220Pc. A C. elegans homolog is NM_069656 and a Drosophila
homolog is AE003453.

All or part of Kidins220 is located in genomic DNA represented by GenBank ID:
AC012495.8 and Contig ID: NT_022194. The present invention relates to any isolated
introns and exons that are present in the gene. Intron and exon boundaries can be routinely
determined, e.g., using the polypeptide and genomic sequences disclosed herein.

Human Kidins220Pc maps to chromosomal band 2p25.1. Hereditary essential tremor
(OMIM 602134) maps to this location. Nucleic acids of the present invention can be used as
linkage markers, diagnostic targets, therapeutic targets, for this disorder, as well as any
disorders or genes mapping in proximity to it.

Kidins220 was originally identified as a substrate of protein kinase D ("PKD"), a
serine/threonine kinase regulated by diacylglycerol signaling pathways. See, Iglesias et al.,
J. Biol. Chem., 275:40048-40056, 2000. It is phosphorylated by PKD at the serine at
position 919, and is its first physiologically-occurring substrate. See, Iglesias et al. Thus,
human Kidins220Pc can be used as a substrate in assays for PKD activity. See, e.g.,
Iglesias et al. for how such assays can be carried out.

In addition to its association with prostate cancer, Kidins220Pc expression can be
affected in other tissues, as well. For example, Iglesias et al. reported that it is expressed at
very high levels in the brain and has a role in neurite outgrowth, making it useful for the
treatment and analysis of neurodegenerative diseases, including spinal cord injuries,
Parkinson's disease, Alzheimer's disease, multiple sclerosis, traumatic head injury, etc. For
example, modulation of human Kidins220Pc can be utilized to regulate neurite outgrowth and
subsequent synaptogenesis.

DEPTA genes 

DEPTA-1, -2, and -3 (Pcp409, Pcp461, and Pcp578, respectively) are differentially
expressed prostate tumor antigen genes ("DEPTA") that are highly up-regulated in prostate
cancers. DEPTA-1 (SEQ ID NO 84) and DEPTA-2 (SEQ ID NO 85) are non-coding
transcripts. DEPTA-2 is encoded by three exons. DEPTA-1 is only a single exon and is
located in the intron region of DEPTA-2. The present invention relates to the nucleic acid
fragments comprising the individual introns and exons of the DEPTA-1/2 cluster.
Expression of DEPTA-1 is highly restricted to the prostate, and substantially no other tissues.
DEPTA-2 is not as highly restricted to prostate, but is expressed in testis and stomach tissue,
as well.

DEPTA-3 codes for a polypeptide containing 139 amino acids. The nucleotide and
amino acid sequences of DEPTA-3 are shown in SEQ ID NOS 86 and 87. DEPTA-3 is a
highly-charged polypeptide having a putative phosphorylation site at about amino acid
residues 49-51. It has homology to other proteins which have binding activity, suggesting
that it binds to a nucleic acid or protein binding partner. DEPTA-3 is expressed in normal
prostate, as well as kidney, heart, stomach, pancreas, and thyroid.

All or part of DEPTA-3 is located in genomic DNA represented by GenBank ID:
AC018601, BAC-ID: RP11-28G15, and Contig ID: NT_0054207. Its 5' end is associated
with UniGene cluster Hs.135941. The present invention relates to any isolated introns and
exons that are present in the gene. Intron and exon boundaries can be routinely determined,
e.g., using the polypeptide and genomic sequences disclosed herein. DEPTA-3 maps to
chromosomal band 2p13.

Other genes

Membrane (i.e., cell-surface) proteins coded for by up-regulated genes (e.g.,
PCP0816) are useful targets for antibodies and other binding partners (e.g., ligands, aptamers,
small peptides, etc.) to selectively target agents to a breast cancer tissue for any purpose,
including, but not limited to, imaging, therapeutic, diagnostic, drug delivery, gene therapy, etc.
For example, binding partners, such as antibodies, can be used to treat carcinomas in analogy
to how c-erbB-2 antibodies are used to treat breast cancer. Membrane (e.g., PCP0405 when
shed into the blood and other fluid) and extracellular proteins (e.g., PCP0389 or PCP0664) can
also be used as diagnostic markers for cancer, and to assess the progress of the disease, e.g.,
in analogy to how PSA levels are used to diagnose prostate cancer. Useful antibodies or other
binding partners include those that are specific for parts of the polypeptide which are exposed
extracellularly as indicated in Tables 1 and 4. Tables 3 and 4 summarize the expression profile
of these genes.

Polynucleotides of the present invention can also be used to detect metastatic cells in 
the blood. For instance, PCP0389, PCP0814, PCP0424, PC0382, PCP0840, PCP0842, 
PCP0405, PC0177, PCP0677, and PCP0806 are absent from peripheral blood cells, and can 
therefore be used in diagnostic tests to assess whether prostate cancer cells have metastasized
from the primary site.

Polynucleotides of the present invention have been mapped to specific chromosomal
bands. Different human disorders are associated with these chromosome locations. See,
Tables 2 and 5. The polynucleotides and polypeptides they encode can be used as linkage
markers, diagnostic targets, therapeutic targets, for any of the mentioned disorders, as well as
any disorders or genes mapping in proximity to them. Of particular interest are those genes
which map to cancer loci, such as PCP0749, PCP0814, PCP0816, PCP0405, PCP0459,
PCP0677, and PCP0762.

The present invention relates to the complete polynucleotide and polypeptide
sequences disclosed herein, as well as fragments thereof. Useful fragments include those
which are unique and which do not overlap any known gene (e.g., amino acid residues 1-394
of SEQ ID NO 2 of PCP0749), which overlap with a known sequence (e.g., amino acid
residues 395-1564 of SEQ ID NO 2 of PCP0749), which span alternative splice junctions
(e.g., comprising amino acid residues 585-586 of PCP0424A of SEQ ID NO 18), which are
unique to a public sequence as indicated in the figures (e.g., amino acid residues 2149-
2265 of NM_133433 of SEQ ID NO 35), which span an alternative splice junction of a
public sequence (e.g., 532-533 of NM_005690 of SEQ ID NO 40), etc. Unique sequences
can also be described as being specific for a gene because they are characteristic of the gene,
but not related genes. The unique or specific sequences include polypeptide sequences,
coding nucleotide sequences (e.g., as illustrated in the figures), and non-coding nucleotide
sequences.

Below, for illustration, are some examples of polypeptides (included are the
polynucleotides which encode them); however, the present invention includes all fragments,
especially of the categories mentioned above and exemplified below.

PCP0749 (SEQ ID NO 1-2): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-394, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0389 (SEQ ID NO 5-6): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-1-1 17, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0814 (SEQ ID NO 9-10): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-33, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0623 (SEQ ID NO 11-12): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-539, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0815 (SEQ ID NO 13-14): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-22, 964-1010, 1011-1041, polypeptide fragments thereof,
and polynucleotides encoding said polypeptides;

PCP0840 (SEQ ID NO 15-16): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-129, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0424A (SEQ ID NO 17-18): polypeptides comprising, consisting of, or
consisting essentially of about amino acids 1-53, 585-586, 586-611, polypeptide fragments
thereof, and polynucleotides encoding said polypeptides;

PCP0424B (SEQ ID NO 19-20): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-53, 585-586, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides;

PCP0424C (SEQ ID NO 21-22): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-53, 585-586, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides;

PCP0816 (SEQ ID NO 25-26): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 268-317, 623, 992-1013, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides;

PCP0480 (SEQ ID NO 27-28): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-151, 152-171, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides;

PC0382 (SEQ ID NO 23-24): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-9, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides;

PCP0842 (SEQ ID NO 29-30): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-454, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides.

PCP405 (SEQ ID NO 45-46): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-351, 941, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides. PCP405 has high expression in the adrenal
gland, brain and pituitary gland, and codes for a polypeptide which comprises domains
characteristic of the attractins and other cell adhesion and guidance proteins. See, e.g.,
Duke-Cohan et al., Proc. Natl. Acad. Sci., 95:11336-11341, 1998.

PC0177A (SEQ ID NOS 54-55): polypeptides comprising, consisting of, or
consisting essentially of about amino acids 1-85, 560-594, 1139-1167, 1167-1168, 1168-
1744, polypeptide fragments thereof, and polynucleotides encoding said polypeptides;

PC0177B (SEQ ID NOS 56-57): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-85, 559-560, 1104-1132, 1132-1133, 1132-1709,
polypeptide fragments thereof, and polynucleotides encoding said polypeptides;

PC0177C (SEQ ID NOS 58-59): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-85, 559-560, 1104-1132, 1703-1908, polypeptide
fragments thereof, and polynucleotides encoding said polypeptides;

PC0177D (SEQ ID NOS 60-61): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-85, 559-560, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides. PC0177 comprises coiled-coil domains
involved in protein interactions.

PCP454A (SEQ ID NO 50-51; Fig. 14): polypeptides comprising, consisting of, or
consisting essentially of about amino acids 1-1890, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides. PCP454B (SEQ ID NOS 48-49) codes for a 577-
amino acid polypeptide. This polypeptide comprises a nucleotide binding site which can be
used to assay for its activity, e.g., by a filtration-type assay using radioactive ATP or other
nucleotides. Nucleotide binding can also be used to purify the polypeptide, e.g., using a
column comprising a nucleotide. PCP454A and B are contiguous, and a transcript has also
been detected (SEQ ID NO 47) which comprises both open reading frames, where 454B is in
the 5' region, and about 2 kb down from it is 454A.

PCP0557 (SEQ ID NO 62-63): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 1-237, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides. PCP0557 polypeptide has a phosphoacceptor domain indicating
that it is involved in signal transduction. This domain (e.g., amino acids 565-620) can be
used as a substrate in kinase assays, e.g., as described in Kemp et al., "Design and use of
peptide substrates for protein kinases," Methods in Enzymol., 200:121-34, 1991; Wang et al.,
"Identification of the major site of rat prolactin phosphorylation as serine 177," J. Biol.
Chem., 271:2462-9, 1996; Yasuda et al., "A synthetic peptide substrate for selective assay of
protein kinase C," Biochem. Biophys. Res. Comm., 166:1220-7, 1990; Gonzalez et al., "Use
of the synthetic peptide neurogranin(28-43) as a selective protein kinase C substrate in assays
of tissue homogenates," Anal. Biochem., 215:184-9, 1993; Parker et al., "Development of
high throughput screening assays using fluorescence polarization: nuclear receptor-ligand-
binding and kinase/phosphatase assays," J. Biomol. Screen., 5:77-88, April 2000. See, also,
U.S. Pat. Nos. 6,203,994, 6,074,861, 6,066,462, 6,004,757, and 5,741,689.

PCP0762 (SEQ ID NO 68-69): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 82-86, 113-221, polypeptide fragments thereof, and
polynucleotides encoding said polypeptides. It contains a SCAN domain involved in
transcriptional regulation.

PCP0806 (SEQ ID NO 70-71): polypeptides comprising, consisting of, or consisting
essentially of about amino acids 31-32, polypeptide fragments thereof, and polynucleotides
encoding said polypeptides. PCP0806 is an intracellular protein that shows high expression
in lung, pancreas, prostate, and stomach.

PCP0815A (SEQ ID NO 72-73): polypeptides comprising, consisting of, or
consisting essentially of about amino acids 1-24, 131-1005, 744, polypeptide fragments
thereof, and polynucleotides encoding said polypeptides; PCP0815C (SEQ ID NO 74-75):
polypeptides comprising, consisting of, or consisting essentially of about amino acids 1-24,
polypeptide fragments thereof, and polynucleotides encoding said polypeptides. The gene is
expressed in many tissues, but is highest in brain and pituitary. PCP0815A comprises seven
zinc finger domains, indicating that it is a transcriptional regulator. PCP0815C is missing
these transcriptional domains, indicating that it can be a regulator (e.g., a negative regulator)
of PCP0815A.

PCP0664 (SEQ ID NO 64-65) is a 122 amino acid polypeptide comprising an N-
terminal hydrophobic region. It has a signal peptide cleavage site at about between amino
acids 18 and 19, indicating that it can be a secreted molecule.
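
The fragment lists above follow a regular pattern of comma-separated ranges and single positions. As an illustrative sketch only, such a list can be turned into coordinate pairs for downstream use; the parser below is a hypothetical helper and assumes well-formed input in the style of "1-53, 585-586, 586-611".

```python
def parse_ranges(text: str) -> list:
    """Parse 'a-b, c, d-e' into a list of 1-based inclusive (start, end) pairs.

    A single position such as '623' becomes the pair (623, 623).
    """
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            ranges.append((int(start), int(end)))
        else:
            position = int(part)
            ranges.append((position, position))
    return ranges


# Ranges copied from the PCP0424A and PCP0816 entries above.
print(parse_ranges("1-53, 585-586, 586-611"))
print(parse_ranges("268-317, 623, 992-1013"))
```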

Nucleic acids 

A mammalian polynucleotide, or fragment thereof, of the present invention is a
polynucleotide having a nucleotide sequence obtainable from a natural source. When the
species name is used, e.g., a human, it indicates that the polynucleotide or polypeptide is
obtainable from a natural source. It therefore includes naturally-occurring normal, naturally-
occurring mutant, and naturally-occurring polymorphic alleles (e.g., SNPs), differentially-
spliced transcripts, splice-variants, etc. By the term "naturally-occurring," it is meant that the
polynucleotide is obtainable from a natural source, e.g., animal tissue and cells, body fluids,
tissue culture cells, forensic samples. Natural sources include, e.g., living cells obtained from
tissues and whole organisms, tumors, cultured cell lines, including primary and immortalized
cell lines. Naturally-occurring mutations can include deletions (e.g., a truncated amino- or
carboxy-terminus), substitutions, inversions, or additions of nucleotide sequence. These
genes can be detected and isolated by polynucleotide hybridization according to methods
which one skilled in the art would know, e.g., as discussed below.

A polynucleotide according to the present invention can be obtained from a variety of
different sources. It can be obtained from DNA or RNA, such as polyadenylated mRNA or
total RNA, e.g., isolated from tissues, cells, or whole organism. The polynucleotide can be
obtained directly from DNA or RNA, from a cDNA library, from a genomic library, etc. The
polynucleotide can be obtained from a cell or tissue (e.g., from an embryonic or adult tissue)
at a particular stage of development, having a desired genotype, phenotype, disease status,
etc. A polynucleotide which "codes without interruption" refers to a polynucleotide having a
continuous open reading frame ("ORF") as compared to an ORF which is interrupted by
introns or other noncoding sequences.
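
One simple way to test whether a candidate sequence "codes without interruption" is sketched below. It is a simplification under stated assumptions: the standard genetic code's stop codons, an ATG start, and a sequence supplied as the coding strand. An in-frame stop codon is only one symptom of an interruption, so a sequence passing this check is not thereby shown to be free of introns. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def codes_without_interruption(dna: str) -> bool:
    """True if dna reads as one continuous ORF.

    Requires an ATG start codon, a length that is a whole number of codons,
    a terminal stop codon, and no stop codon before the last codon.
    """
    dna = dna.upper()
    if len(dna) < 6 or len(dna) % 3 != 0:
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(codes_without_interruption("ATGGCCTAA"))     # True
print(codes_without_interruption("ATGTAAGCCTGA"))  # False, internal stop codon
```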

Polynucleotides and polypeptides (including any part of a differentially regulated
cancer gene) can be excluded as compositions from the present invention if, e.g., listed in a
publicly available database on the day this application was filed and/or disclosed in a patent
application having an earlier filing or priority date than this application and/or conceived
and/or reduced to practice earlier than a polynucleotide in this application.

As described herein, the phrase "an isolated polynucleotide which is SEQ ID NO," or
"an isolated polynucleotide which is selected from SEQ ID NO," refers to an isolated nucleic
acid molecule from which the recited sequence was derived (e.g., a cDNA derived from
mRNA; cDNA derived from genomic DNA). Because of sequencing errors, typographical
errors, etc., the actual naturally-occurring sequence may differ from a SEQ ID listed herein.
Thus, the phrase indicates the specific molecule from which the sequence was derived, rather
than a molecule having that exact recited nucleotide sequence, analogously to how a culture
depository number refers to a specific cloned fragment in a cryotube.

As explained in more detail below, a polynucleotide sequence of the invention can
contain the complete sequence as shown in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95,
and/or 99, degenerate sequences thereof, anti-sense, muteins thereof, genes comprising said
sequences, full-length cDNAs comprising said sequences, complete genomic sequences,
fragments thereof, homologs, primers, nucleic acid molecules which hybridize thereto,
derivatives thereof, etc.

Genomic 

The present invention also relates to genomic DNA from which the polynucleotides of
the present invention can be derived. A genomic DNA coding for a human, mouse, or other
mammalian polynucleotide, can be obtained routinely, for example, by screening a genomic
library (e.g., a YAC library) with a polynucleotide of the present invention, or by searching
nucleotide databases, such as GenBank and EMBL, for matches. Promoter and other
regulatory regions (including both 5' and 3' regions, as well as introns) can be identified
upstream or downstream of coding and expressed RNAs, and assayed routinely for activity,
e.g., by joining to a reporter gene (e.g., CAT, GFP, alkaline phosphatase, luciferase,
galactosidase). A promoter obtained from a differentially regulated cancer gene can be used,
e.g., in gene therapy to obtain tissue-specific expression of a heterologous gene (e.g., coding
for a therapeutic product or cytotoxin). 5' and 3' sequences (including UTRs and introns)
can be used to modulate or regulate stability, transcription, and translation of nucleic acids,
including the sequence to which it is attached in nature, as well as heterologous nucleic acids.

Constructs 

A polynucleotide of the present invention can comprise additional polynucleotide
sequences, e.g., sequences to enhance expression, detection, uptake, cataloging, tagging, etc.
A polynucleotide can include only coding sequence; a coding sequence and additional non-
naturally occurring or heterologous coding sequence (e.g., sequences coding for leader,
signal, secretory, targeting, enzymatic, fluorescent, antibiotic resistance, and other functional
or diagnostic peptides); coding sequences and non-coding sequences, e.g., untranslated
sequences at either a 5' or 3' end, or dispersed in the coding sequence, e.g., introns.

A polynucleotide according to the present invention also can comprise an expression
control sequence operably linked to a polynucleotide as described above. The phrase
"expression control sequence" means a polynucleotide sequence that regulates expression of
a polypeptide coded for by a polynucleotide to which it is functionally ("operably") linked.
Expression can be regulated at the level of the mRNA or polypeptide. Thus, the expression
control sequence includes mRNA-related elements and protein-related elements. Such
elements include promoters, enhancers (viral or cellular), ribosome binding sequences,
transcriptional terminators, etc. An expression control sequence is operably linked to a
nucleotide coding sequence when the expression control sequence is positioned in such a
manner to effect or achieve expression of the coding sequence. For example, when a
promoter is operably linked 5' to a coding sequence, expression of the coding sequence is
driven by the promoter. Expression control sequences can include an initiation codon and
additional nucleotides to place a partial nucleotide sequence of the present invention in-frame
in order to produce a polypeptide (e.g., pET vectors from Promega have been designed to
permit a molecule to be inserted into all three reading frames to identify the one that results
in polypeptide expression). Expression control sequences can be heterologous or endogenous
to the normal gene.
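
The idea of trying a partial sequence in all three reading frames can be illustrated computationally. The sketch below, with a hypothetical function name and a made-up insert, reports which of the three forward frame offsets contain no stop codon; it assumes the standard stop codons and says nothing about how any particular vector system performs the insertion.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code


def open_frames(insert: str) -> list:
    """Return the frame offsets (0, 1, 2) in which the insert has no stop codon."""
    insert = insert.upper()
    result = []
    for offset in range(3):
        # Only complete codons are considered; trailing bases are ignored.
        codons = [insert[i:i + 3] for i in range(offset, len(insert) - 2, 3)]
        if codons and not any(codon in STOPS for codon in codons):
            result.append(offset)
    return result


# Made-up nine-base insert: frame 0 reads GCC TAA (stop), frames 1 and 2 stay open.
print(open_frames("GCCTAAGCC"))  # [1, 2]
```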

A polynucleotide of the present invention can also comprise nucleic acid vector
sequences, e.g., for cloning, expression, amplification, selection, etc. Any effective vector
can be used. A vector is, e.g., a polynucleotide molecule which can replicate autonomously
in a host cell, e.g., containing an origin of replication. Vectors can be useful to perform
manipulations, to propagate, and/or obtain large quantities of the recombinant molecule in a
desired host. A skilled worker can select a vector depending on the purpose desired, e.g., to
propagate the recombinant molecule in bacteria, yeast, insect, or mammalian cells. The
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9
(Qiagen), pBS, pD10, Phagescript, phiX174, pBK Phagemid, pNH8A, pNH16a, pNH18Z,
pNH46A (Stratagene); Bluescript KS+II (Stratagene); ptrc99a, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: PWLNEO, pSV2CAT, pOG44, pXT1, pSG
(Stratagene), pSVK3, PBPV, PMSG, pSVL (Pharmacia), pCR2.1/TOPO, pCRII/TOPO,
pCR4/TOPO, pTrcHisB, pCMV6-XL4, etc. However, any other vector, e.g., plasmids,
viruses, or parts thereof, may be used as long as they are replicable and viable in the desired
host. The vector can also comprise sequences which enable it to replicate in the host whose
genome is to be modified.

Hybridization

Polynucleotide hybridization, as discussed in more detail below, is useful in a variety
of applications, including in gene detection methods, for identifying mutations, for making
mutations, to identify homologs in the same and different species, to identify related
members of the same gene family, in diagnostic and prognostic assays, in therapeutic
applications (e.g., where an antisense polynucleotide is used to inhibit expression), etc.

The ability of two single-stranded polynucleotide preparations to hybridize together is
a measure of their nucleotide sequence complementarity, e.g., base-pairing between
nucleotides, such as A-T, G-C, etc. The invention thus also relates to polynucleotides, and
their complements, which hybridize to a polynucleotide comprising a nucleotide sequence as
set forth in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99 and genomic
sequences thereof. A nucleotide sequence hybridizing to the latter sequence will have a
complementary polynucleotide strand, or act as a template for one in the presence of a
polymerase (i.e., an appropriate polynucleotide synthesizing enzyme). The present invention
includes both strands of polynucleotide, e.g., a sense strand and an anti-sense strand.
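
As a minimal illustration of the base-pairing rules above (A-T, G-C), the following Python sketch derives the anti-sense strand from a DNA sense strand. The function name and the example sequence are hypothetical and are not taken from the sequence listing; only unambiguous DNA bases are handled.

```python
# Watson-Crick pairing for unambiguous DNA bases (A-T, G-C).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the anti-sense strand, written 5' to 3', of a DNA sense strand."""
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGGCC"  # hypothetical sequence, for illustration only
    print(reverse_complement(example))
```

Applying the function twice returns the original strand, which is a quick check that the two strands are complements of one another.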

Hybridization conditions can be chosen to select polynucleotides which have a
desired amount of nucleotide complementarity with the nucleotide sequences set forth in
SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58,
60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99 and genomic sequences thereof.
A polynucleotide capable of hybridizing to such sequence, preferably, possesses, e.g., about
70%, 75%, 80%, 85%, 87%, 90%, 92%, 95%, 97%, 99%, or 100% complementarity
between the sequences. The present invention particularly relates to polynucleotide
sequences which hybridize to the nucleotide sequences set forth in SEQ ID NOS 1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72,
74, 84, 85, 86, 88, 95, and/or 99 or genomic sequences thereof, under low or high stringency
conditions. These conditions can be used, e.g., to select corresponding homologs in non-human
species.

Polynucleotides which hybridize to polynucleotides of the present invention can be
selected in various ways. Filter-type blots (i.e., matrices containing polynucleotide, such as
nitrocellulose), glass chips, and other matrices and substrates comprising polynucleotides
(short or long) of interest, can be incubated in a prehybridization solution (e.g., 6X SSC,
0.5% SDS, 100 µg/ml denatured salmon sperm DNA, 5X Denhardt's solution, and 50%
formamide), at 22-68°C, overnight, and then hybridized with a detectable polynucleotide
probe under conditions appropriate to achieve the desired stringency. In general, when high
homology or sequence identity is desired, a high temperature can be used (e.g., 65°C). As
the homology drops, lower washing temperatures are used. For salt concentrations, the lower
the salt concentration, the higher the stringency. The length of the probe is another
consideration. Very short probes (e.g., less than 100 base pairs) are washed at lower
temperatures, even if the homology is high. With short probes, formamide can be omitted.
See, e.g., Current Protocols in Molecular Biology, Chapter 6, Screening of Recombinant
Libraries; Sambrook et al., Molecular Cloning, 1989, Chapter 9.

For instance, high stringency conditions can be achieved by incubating the blot
overnight (e.g., at least 12 hours) with a polynucleotide probe in a hybridization solution
containing, e.g., about 5X SSC, 0.5% SDS, 100 µg/ml denatured salmon sperm DNA and
50% formamide, at 42°C, or hybridizing at 42°C in 5X SSPE, 0.5% SDS, and 50%
formamide, 100 µg/ml denatured salmon sperm DNA, and washing at 65°C in 0.1% SSC and
0.1% SDS.

Blots can be washed at high stringency conditions that allow, e.g., for less than 5%
bp mismatch (e.g., wash twice in 0.1% SSC and 0.1% SDS for 30 min at 65°C), i.e.,
selecting sequences having 95% or greater sequence identity.

Other non-limiting examples of high stringency conditions include a final wash at
65°C in aqueous buffer containing 30 mM NaCl and 0.5% SDS. Another example of high
stringent conditions is hybridization in 7% SDS, 0.5 M NaPO4, pH 7, 1 mM EDTA at 50°C,
e.g., overnight, followed by one or more washes with a 1% SDS solution at 42°C.
Whereas high stringency washes can allow for, e.g., less than 10%, less than 5% mismatch,
etc., reduced or low stringency conditions can permit up to 20% nucleotide mismatch.
Hybridization at low stringency can be accomplished as above, but using lower formamide
conditions, lower temperatures and/or lower salt concentrations, as well as longer periods of
incubation time.

Hybridization can also be based on a calculation of melting temperature (Tm) of the
hybrid formed between the probe and its target, as described in Sambrook et al. Generally,
the temperature Tm at which a short oligonucleotide (containing 18 nucleotides or fewer)
will melt from its target sequence is given by the following equation: Tm = (number of A's
and T's) x 2°C + (number of C's and G's) x 4°C. For longer molecules, Tm = 81.5 + 16.6
log10[Na+] + 0.41(%GC) - 600/N, where [Na+] is the molar concentration of sodium ions,
%GC is the percentage of GC base pairs in the probe, and N is the length. Hybridization can
be carried out at several degrees below this temperature to ensure that the probe and target
can hybridize. Mismatches can be allowed for by lowering the temperature even further.
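
The two melting-temperature formulas above can be worked through in a short Python sketch. The function names, the example probes, and the 1.0 M sodium concentration are assumptions chosen for illustration; the sketch simply evaluates the equations as stated and does not model mismatches or formamide.

```python
import math


def tm_short_oligo(probe: str) -> float:
    """Tm in degrees C for a short oligonucleotide (18 nucleotides or fewer).

    Tm = 2 x (number of A's and T's) + 4 x (number of C's and G's).
    """
    seq = probe.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2.0 * at_count + 4.0 * gc_count


def tm_long_probe(probe: str, sodium_molar: float) -> float:
    """Tm in degrees C for a longer probe.

    Tm = 81.5 + 16.6 * log10[Na+] + 0.41 * (%GC) - 600 / N.
    """
    seq = probe.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("probe must not be empty")
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5 + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent - 600.0 / length)


if __name__ == "__main__":
    # Hypothetical probes, for illustration only.
    print(tm_short_oligo("ACGTACGTACGTACGTAC"))   # 18-mer
    print(tm_long_probe("ACGT" * 25, 1.0))        # 100-mer, 1.0 M sodium
```

In practice the hybridization temperature would then be set several degrees below the computed value, as the text describes.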

Stringent conditions can be selected to isolate sequences, and their complements,
which have, e.g., at least about 90%, 95%, 97%, or more, etc., nucleotide complementarity
between the probe (e.g., a short polynucleotide of SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86,
88, 95, and/or 99 or genomic sequences thereof) and a target polynucleotide.

Other homologs of polynucleotides of the present invention can be obtained from
mammalian and non-mammalian sources according to various methods. For example,
hybridization with a polynucleotide can be employed to select homologs, e.g., as described in
Sambrook et al., Molecular Cloning, Chapter 11, 1989. Such homologs can have varying
amounts of nucleotide and amino acid sequence identity and similarity to such
polynucleotides of the present invention. Mammalian organisms include, e.g., mice, rats,
monkeys, pigs, cows, etc. Non-mammalian organisms include, e.g., vertebrates,
invertebrates, zebra fish, chicken, Drosophila, C. elegans, Xenopus, yeast such as S. pombe,
S. cerevisiae, roundworms, prokaryotes, plants, Arabidopsis, artemia, viruses, etc. The
degree of nucleotide sequence identity between human and mouse can be about, e.g., 70% or
more, 85% or more for open reading frames, etc.

Alignment

Alignments can be accomplished by using any effective algorithm. For pairwise
alignments of DNA sequences, the methods described by Wilbur-Lipman (e.g., Wilbur and
Lipman, Proc. Natl. Acad. Sci., 80:726-730, 1983) or Martinez/Needleman-Wunsch (e.g.,
Martinez, Nucleic Acid Res., 11:4629-4634, 1983) can be used. For instance, if the
Martinez/Needleman-Wunsch DNA alignment is applied, the minimum match can be set at
9, gap penalty at 1.10, and gap length penalty at 0.33. The results can be calculated as a
similarity index, equal to the sum of the matching residues divided by the sum of all residues
and gap characters, and then multiplied by 100 to express as a percent. Similarity index for
related genes at the nucleotide level in accordance with the present invention can be greater
than 70%, 80%, 85%, 90%, 95%, 99%, or more. Pairs of protein sequences can be aligned
by the Lipman-Pearson method (e.g., Lipman and Pearson, Science, 227:1435-1441, 1985)
with k-tuple set at 2, gap penalty set at 4, and gap length penalty set at 12. Results can be
expressed as percent similarity index, where related genes at the amino acid level in
accordance with the present invention can be greater than 65%, 70%, 75%, 80%, 85%, 90%,
95%, 99%, or more. Various commercial and free sources of alignment programs are
available, e.g., MegAlign by DNA Star, BLAST (National Center for Biotechnology
Information), BCM (Baylor College of Medicine) Launcher, etc. BLAST can be used to
calculate amino acid sequence identity, amino acid sequence homology, and nucleotide
sequence identity. These calculations can be made along the entire length of each of the
target sequences which are to be compared.

After two sequences have been aligned, a "percent sequence identity" can be
determined. For these purposes, it is convenient to refer to a Reference Sequence and a
Compared Sequence, where the Compared Sequence is compared to the Reference Sequence.
Percent sequence identity can be determined according to the following formula: Percent
Identity = 100 [1-(C/R)], wherein C is the number of differences between the Reference
Sequence and the Compared Sequence over the length of alignment between the Reference
Sequence and the Compared Sequence where (i) each base or amino acid in the Reference
Sequence that does not have a corresponding aligned base or amino acid in the Compared
Sequence, (ii) each gap in the Reference Sequence, (iii) each aligned base or amino acid in the
Reference Sequence that is different from an aligned base or amino acid in the Compared
Sequence, constitutes a difference; and R is the number of bases or amino acids in the
Reference Sequence over the length of the alignment with the Compared Sequence with any
gap created in the Reference Sequence also being counted as a base or amino acid.
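
The percent identity formula above can be sketched in Python for two sequences that have already been aligned. The function name, the use of "-" as the gap character, and the example alignment are assumptions made for illustration; the alignment itself would come from one of the programs mentioned earlier.

```python
def percent_identity(reference: str, compared: str, gap: str = "-") -> float:
    """Percent Identity = 100 * [1 - (C / R)] for two pre-aligned sequences.

    C counts (i) reference residues aligned to a gap in the compared sequence,
    (ii) gaps in the reference, and (iii) aligned residues that differ.
    R counts every alignment column of the reference, gaps included.
    """
    if len(reference) != len(compared):
        raise ValueError("aligned sequences must have the same length")
    if not reference:
        raise ValueError("alignment must not be empty")

    differences = 0   # C in the formula
    ref_length = 0    # R in the formula
    for ref_char, cmp_char in zip(reference.upper(), compared.upper()):
        ref_length += 1  # residues and gaps in the reference both count
        if ref_char == gap or cmp_char == gap or ref_char != cmp_char:
            differences += 1
    return 100.0 * (1.0 - differences / ref_length)


if __name__ == "__main__":
    # Hypothetical aligned pair, for illustration only.
    print(percent_identity("ACG-TACGT", "ACGATAC-T"))
```

A column in which both sequences carry a gap is counted as a difference here, since it is a gap in the Reference Sequence; most alignment programs do not emit such columns.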

Percent sequence identity can also be determined by other conventional methods, e.g.,
as described in Altschul et al., Bull. Math. Bio. 48: 603-616, 1986 and Henikoff and
Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-10919, 1992.



Specific polynucleotide probes

A polynucleotide of the present invention can comprise any continuous nucleotide
sequence of SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99, sequences which
share sequence identity thereto, or complements thereof. The term "probe" refers to any
substance that can be used to detect, identify, isolate, etc., another substance. A
polynucleotide probe is comprised of nucleic acid and can be used to detect, identify, etc., other
nucleic acids, such as DNA and RNA.

These polynucleotides can be of any desired size that is effective to achieve the
specificity desired. For example, a probe can be from about 7 or 8 nucleotides to several
thousand nucleotides, depending upon its use and purpose. For instance, a probe used as a
primer for PCR can be shorter than a probe used in an ordered array of polynucleotide probes.
Probe sizes vary, and the invention is not limited in any way by their size, e.g., probes can be
from about 7-2000 nucleotides, 7-1000, 8-700, 8-600, 8-500, 8-400, 8-300, 8-150, 8-100,
8-75, 7-50, 10-25, 14-16, at least about 8, at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, or more, etc. The polynucleotides can have non-naturally-occurring
nucleotides, e.g., inosine, AZT, 3TC, etc. The polynucleotides can have 100% sequence
identity or complementarity to a sequence of SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88,
95, and/or 99, or it can have mismatches or nucleotide substitutions, e.g., 1, 2, 3, 4, or 5
substitutions. The probes can be single-stranded or double-stranded.

In accordance with the present invention, a polynucleotide can be present in a kit,
where the kit includes, e.g., one or more polynucleotides, a desired buffer (e.g., phosphate,
tris, etc.), detection compositions, RNA or cDNA from different tissues to be used as
controls, libraries, etc. The polynucleotide can be labeled or unlabeled, with radioactive or
non-radioactive labels as known in the art. Kits can comprise one or more pairs of
polynucleotides for amplifying nucleic acids specific for differentially regulated cancer
genes, e.g., comprising a forward and reverse primer effective in PCR. These include both
sense and anti-sense orientations. For instance, in PCR-based methods (such as RT-PCR), a
pair of primers are typically used, one having a sense sequence and the other having an

antisense sequence.

Another aspect of the present invention is a nucleotide sequence that is specific to, or
for, a selective polynucleotide. The phrases "specific for" or "specific to" a polynucleotide
have a functional meaning that the polynucleotide can be used to identify the presence of one
or more target genes in a sample and distinguish them from non-target genes. It is specific in
the sense that it can be used to detect polynucleotides above background noise ("non-specific
binding"). A specific sequence is a defined order of nucleotides (or amino acid sequences, if
it is a polypeptide sequence) which occurs in the polynucleotide, e.g., in the nucleotide
sequences of SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99, and which is
characteristic of that target sequence, and substantially no non-target sequences. A probe or
mixture of probes can comprise a sequence or sequences that are specific to a plurality of
target sequences, e.g., where the sequence is a consensus sequence, a functional domain, etc.,
e.g., capable of recognizing a family of related genes. Such sequences can be used as probes
in any of the methods described herein or incorporated by reference. Both sense and
antisense nucleotide sequences are included. A specific polynucleotide according to the
present invention can be determined routinely.

A polynucleotide comprising a specific sequence can be used as a hybridization probe
to identify the presence of, e.g., human or mouse polynucleotide, in a sample comprising a
mixture of polynucleotides, e.g., on a Northern blot. Hybridization can be performed under
high stringent conditions (see, above) to select polynucleotides (and their complements which
can contain the coding sequence) having at least 90%, 95%, 99%, etc., identity (i.e.,
complementarity) to the probe, but less stringent conditions can also be used. A specific
polynucleotide sequence can also be fused in-frame, at either its 5' or 3' end, to various
nucleotide sequences as mentioned throughout the patent, including coding sequences for
enzymes, detectable markers, GFP, etc., expression control sequences, etc.

A polynucleotide probe, especially one that is specific to a polynucleotide of the
present invention, can be used in gene detection and hybridization methods as already
described. In one embodiment, a specific polynucleotide probe can be used to detect
whether a particular tissue or cell-type is present in a target sample. To carry out such a
method, a selective polynucleotide can be chosen which is characteristic of the desired target
tissue. Such polynucleotide is preferably chosen so that it is expressed or displayed in the
target tissue, but not in other tissues which are present in the sample. For instance, if
detection of prostate or breast cancer is desired, it may not matter whether the selective
polynucleotide is expressed in other tissues, as long as it is not expressed in cells normally
present in blood, e.g., peripheral blood mononuclear cells. Starting from the selective
polynucleotide, a specific polynucleotide probe can be designed which hybridizes (if
hybridization is the basis of the assay) under the hybridization conditions to the selective
polynucleotide, whereby the presence of the selective polynucleotide can be determined.

Probes which are specific for polynucleotides of the present invention can also be
prepared using transcription-based systems, e.g., incorporating an RNA polymerase
promoter into a selective polynucleotide of the present invention, and then transcribing
anti-sense RNA using the polynucleotide as a template. See, e.g., U.S. Pat. No. 5,545,522.

Polynucleotide composition 

A polynucleotide according to the present invention can comprise, e.g., DNA, RNA,
synthetic polynucleotide, peptide polynucleotide, modified nucleotides, dsDNA, ssDNA,
ssRNA, dsRNA, and mixtures thereof. A polynucleotide can be single- or double-stranded,
triplex, DNA:RNA, duplexes, comprise hairpins, and other secondary structures, etc.
Nucleotides comprising a polynucleotide can be joined via various known linkages, e.g.,
ester, sulfamate, sulfamide, phosphorothioate, phosphoramidate, methylphosphonate,
carbamate, etc., depending on the desired purpose, e.g., resistance to nucleases, such as
RNAse H, improved in vivo stability, etc. See, e.g., U.S. Pat. No. 5,378,825. Any desired
nucleotide or nucleotide analog can be incorporated, e.g., 6-mercaptoguanine, 8-oxo-guanine,
etc.

Various modifications can be made to the polynucleotides, such as attaching
detectable markers (avidin, biotin, radioactive elements, fluorescent tags and dyes, energy
transfer labels, energy-emitting labels, binding partners, etc.) or moieties which improve
hybridization, detection, and/or stability. The polynucleotides can also be attached to solid
supports, e.g., nitrocellulose, magnetic or paramagnetic microspheres (e.g., as described in
U.S. Pat. No. 5,411,863; U.S. Pat. No. 5,543,289; for instance, comprising ferromagnetic,
supermagnetic, paramagnetic, superparamagnetic, iron oxide and polysaccharide), nylon,
agarose, diazotized cellulose, latex solid microspheres, polyacrylamides, etc., according to a
desired method. See, e.g., U.S. Pat. Nos. 5,470,967, 5,476,925, and 5,478,893.

Polynucleotide according to the present invention can be labeled according to any
desired method. The polynucleotide can be labeled using radioactive tracers such as 32P, 35S,
3H, or 14C, to mention some commonly used tracers. The radioactive labeling can be carried
out according to any method, such as, for example, terminal labeling at the 3' or 5' end using
a radiolabeled nucleotide, polynucleotide kinase (with or without dephosphorylation with a
phosphatase) or a ligase (depending on the end to be labeled). A non-radioactive labeling can
also be used, combining a polynucleotide of the present invention with residues having
immunological properties (antigens, haptens), a specific affinity for certain reagents
(ligands), properties enabling detectable enzyme reactions to be completed (enzymes or
coenzymes, enzyme substrates, or other substances involved in an enzymatic reaction), or
characteristic physical properties, such as fluorescence or the emission or absorption of light
at a desired wavelength, etc.

Nucleic acid detection methods 

Another aspect of the present invention relates to methods and processes for detecting
differentially regulated cancer genes. Detection methods have a variety of applications,
including for diagnostic, prognostic, forensic, and research applications. To accomplish gene
detection, a polynucleotide in accordance with the present invention can be used as a
"probe." The term "probe" or "polynucleotide probe" has its customary meaning in the art,
e.g., a polynucleotide which is effective to identify (e.g., by hybridization), when used in an
appropriate process, the presence of a target polynucleotide to which it is designed.
Identification can involve simply determining presence or absence, or it can be quantitative,
e.g., in assessing amounts of a gene or gene transcript present in a sample. Probes can be
useful in a variety of ways, such as for diagnostic purposes, to identify homologs, and to
detect, quantitate, or isolate a polynucleotide of the present invention in a test sample.

Assays can be utilized which permit quantification and/or presence/absence detection
of a target nucleic acid in a sample. Assays can be performed at the single-cell level, or in a
sample comprising many cells, where the assay is "averaging" expression over the entire
collection of cells and tissue present in the sample. Any suitable assay format can be used,
including but not limited to, e.g., Southern blot analysis, Northern blot analysis, polymerase
chain reaction ("PCR") (e.g., Saiki et al., Science, 241:53, 1988; U.S. Pat. Nos. 4,683,195,
4,683,202, and 6,040,166; PCR Protocols: A Guide to Methods and Applications, Innis et al.,
eds., Academic Press, New York, 1990), reverse transcriptase polymerase chain reaction
("RT-PCR"), anchored PCR, rapid amplification of cDNA ends ("RACE") (e.g., Schaefer in
Gene Cloning and Analysis: Current Innovations, Pages 99-115, 1997), ligase chain reaction
("LCR") (EP 320 308), one-sided PCR (Ohara et al., Proc. Natl. Acad. Sci., 86:5673-5677,
1989), indexing methods (e.g., U.S. Pat. No. 5,508,169), in situ hybridization, differential
display (e.g., Liang et al., Nucl. Acid. Res., 21:3269-3275, 1993; U.S. Pat. Nos. 5,262,311,
5,599,672 and 5,965,409; WO 97/18454; Prashar and Weissman, Proc. Natl. Acad. Sci.,
93:659-663, and U.S. Pat. Nos. 6,010,850 and 5,712,126; Welsh et al., Nucleic Acid Res.,
20:4965-4970, 1992, and U.S. Pat. No. 5,487,985) and other RNA fingerprinting techniques,
nucleic acid sequence based amplification ("NASBA") and other transcription based
amplification systems (e.g., U.S. Pat. Nos. 5,409,818 and 5,554,527; WO 88/10315),
polynucleotide arrays (e.g., U.S. Pat. Nos. 5,143,854, 5,424,186; 5,700,637, 5,874,219, and
6,054,270; PCT WO 92/10092; PCT WO 90/15070), Qbeta Replicase (PCT/US87/00880),
Strand Displacement Amplification ("SDA"), Repair Chain Reaction ("RCR"), nuclease
protection assays, subtraction-based methods, Rapid-Scan™, etc. Additional useful methods
include, but are not limited to, e.g., template-based amplification methods, competitive PCR
(e.g., U.S. Pat. No. 5,747,251), redox-based assays (e.g., U.S. Pat. No. 5,871,918),
Taqman-based assays (e.g., Holland et al., Proc. Natl. Acad. Sci., 88:7276-7280, 1991; U.S. Pat. Nos.
5,210,015 and 5,994,063), real-time fluorescence-based monitoring (e.g., U.S. Pat. No.
5,928,907), molecular energy transfer labels (e.g., U.S. Pat. Nos. 5,348,853, 5,532,129,
5,565,322, 6,030,787, and 6,117,635; Tyagi and Kramer, Nature Biotech., 14:303-309,
1996). Any method suitable for single cell analysis of gene or protein expression can be
used, including in situ hybridization, immunocytochemistry, MACS, FACS, flow cytometry,
etc. For single cell assays, expression products can be measured using antibodies, PCR, or
other types of nucleic acid amplification (e.g., Brady et al., Methods Mol. & Cell Biol. 2,
17-25, 1990; Eberwine et al., 1992, Proc. Natl. Acad. Sci., 89, 3010-3014, 1992; U.S. Pat. No.
5,723,290). These and other methods can be carried out conventionally, e.g., as described in
the mentioned publications.

Many of such methods may require that the polynucleotide is labeled, or comprises a
particular nucleotide type useful for detection. The present invention includes such modified
polynucleotides that are necessary to carry out such methods. Thus, polynucleotides can be
DNA, RNA, DNA:RNA hybrids, PNA, etc., and can comprise any modification or
substituent which is effective to achieve detection.

Detection can be desirable for a variety of different purposes, including research,
diagnostic, prognostic, and forensic. For diagnostic purposes, it may be desirable to identify
the presence or quantity of a polynucleotide sequence in a sample, where the sample is
obtained from tissue, cells, body fluids, etc. In a preferred method as described in more
detail below, the present invention relates to a method of detecting a polynucleotide
comprising, contacting a target polynucleotide in a test sample with a polynucleotide probe
under conditions effective to achieve hybridization between the target and probe; and
detecting hybridization.

Any test sample in which it is desired to identify a polynucleotide or polypeptide
thereof can be used, including, e.g., blood, urine, saliva, stool (for extracting nucleic acid,
see, e.g., U.S. Pat. No. 6,177,251), swabs comprising tissue, biopsied tissue, tissue sections,
cultured cells, etc.

Polynucleotides can be used in a wide range of methods and compositions, including
for detecting, diagnosing, staging, grading, assessing, prognosticating, etc. diseases and
disorders associated with differentially regulated cancer genes, for monitoring or assessing
therapeutic and/or preventative measures, in ordered arrays, etc. Any method of detecting
genes and polynucleotides can be used; certainly, the present invention is not to be limited
by how such methods are implemented.

Along these lines, the present invention relates to methods of detecting differentially
regulated cancer genes in a sample comprising nucleic acid. Such methods can comprise one
or more of the following steps in any effective order, e.g., contacting said sample with a
polynucleotide probe under conditions effective for said probe to hybridize specifically to
nucleic acid in said sample, and detecting the presence or absence of probe hybridized to
nucleic acid in said sample, wherein said probe is a polynucleotide which is selected from
SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58,
60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99, a polynucleotide having, e.g.,
about 70%, 80%, 85%, 90%, 95%, 99%, or more sequence identity thereto, effective or
specific fragments thereof, or complements thereto. The detection method can be applied to
any sample, e.g., cultured primary, secondary, or established cell lines, tissue biopsy, blood,
urine, stool, cerebral spinal fluid, and other bodily fluids, for any purpose.

Contacting the sample with probe can be carried out by any effective means in any
effective environment. It can be accomplished in a solid, liquid, frozen, gaseous, amorphous,
solidified, coagulated, colloid, etc., mixtures thereof, matrix. For instance, a probe in an
aqueous medium can be contacted with a sample which is also in an aqueous medium, or
which is affixed to a solid matrix, or vice-versa.

Generally, as used throughout the specification, the term "effective conditions"
means, e.g., the particular milieu in which the desired effect is achieved. Such a milieu
includes, e.g., appropriate buffers, oxidizing agents, reducing agents, pH, co-factors,
temperature, ion concentrations, suitable age and/or stage of cell (such as, in a particular part of
the cell cycle, or at a particular stage where particular genes are being expressed) where cells
are being used, culture conditions (including substrate, oxygen, carbon dioxide, etc.). When
hybridization is the chosen means of achieving detection, the probe and sample can be
combined such that the resulting conditions are functional for said probe to hybridize
specifically to nucleic acid in said sample.

The phrase "hybridize specifically" indicates that the hybridization between
single-stranded polynucleotides is based on nucleotide sequence complementarity. The effective
conditions are selected such that the probe hybridizes to a preselected and/or definite target
nucleic acid in the sample. For instance, if detection of a polynucleotide set forth in SEQ ID
NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62,
64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99 is desired, a probe can be selected which
can hybridize to such target gene under high stringent conditions, without significant
hybridization to other genes in the sample. To detect homologs of a polynucleotide set forth
in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56,
58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99, the effective hybridization
conditions can be less stringent, and/or the probe can comprise codon degeneracy, such that a

homolog is detected in the sample.

As already mentioned, the methods can be carried out by any effective process, e.g.,
by Northern blot analysis, polymerase chain reaction (PCR), reverse transcriptase PCR,
RACE PCR, in situ hybridization, etc., as indicated above. When PCR based techniques are
used, two or more probes are generally used. One probe can be specific for a defined
sequence which is characteristic of a selective polynucleotide, but the other probe can be
specific for the selective polynucleotide, or specific for a more general sequence, e.g., a
sequence such as polyA which is characteristic of mRNA, a sequence which is specific for a
promoter, ribosome binding site, or other transcriptional features, a consensus sequence (e.g.,
representing a functional domain). For the former aspects, 5' and 3' probes (e.g., polyA,
Kozak, etc.) are preferred which are capable of specifically hybridizing to the ends of
transcripts. When PCR is utilized, the probes can also be referred to as "primers" in that they
can prime a DNA polymerase reaction.

In addition to testing for the presence or absence of polynucleotides, the present
invention also relates to determining the amounts at which polynucleotides of the present
invention are expressed in a sample and determining the differential expression of such
polynucleotides in samples. Such methods can involve substantially the same steps as
described above for presence/absence detection, e.g., contacting with probe, hybridizing, and
detecting hybridized probe, but using more quantitative methods and/or comparisons to
standards.

The amount of hybridization between the probe and target can be determined by any
suitable methods, e.g., PCR, RT-PCR, RACE PCR, Northern blot, polynucleotide
microarrays, Rapid-Scan, etc., and includes both quantitative and qualitative measurements.
For further details, see the hybridization methods described above and below. Determining
by such hybridization whether the target is differentially expressed (e.g., up-regulated or
down-regulated) in the sample can also be accomplished by any effective means. For
instance, the target's expression pattern in the sample can be compared to its pattern in a
known standard, such as in a normal tissue, or it can be compared to another gene in the same
sample. When a second sample is utilized for the comparison, it can be a sample of normal
tissue that is known not to contain diseased cells. The comparison can be performed on
samples which contain the same amount of RNA (such as polyadenylated RNA or total
RNA), or, on RNA extracted fi-om the same amounts of starting tissue. Such a second 
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sample can also be referred to as a control or standard. Hybridization can also be compared 
to a second target in tbe same tissue sample. Experiments can be performed that determine a 
ratio between the target nucleic add and a second nucleic add (a standard or control) , e,g-, in 
a normal tissue. When the ratio between the target and control are substantially the same in a 

5 normal and sample, fbe sample is determined or diagnosed not to contain cells. However, if 
the ratio is different between the normal and sample tissues, the sample is determined to 
contain cancer cells. The approaches can be combined, and one or more second samples, or 
second targets can be used. Any second target nucleic acid can be used as a comparison, 
including 'liousekeepingf* genes, such as beta-actin, alcohol dehydrogenase, or any other 

10 g^e whose expression does not vary depending upon the disease status of the cell. 
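
By way of a non-limiting illustration, the ratio comparison described above can be sketched
in a few lines of Python. The function names, the signal values, and the two-fold cutoff used
to decide whether two ratios are "substantially the same" are assumptions made only for this
example; the specification does not fix any particular threshold or software.

```python
def expression_ratio(target_signal, control_signal):
    """Ratio of target signal to control (e.g., housekeeping gene) signal."""
    if target_signal <= 0 or control_signal <= 0:
        raise ValueError("signals must be positive")
    return target_signal / control_signal


def compare_to_normal(sample_target, sample_control,
                      normal_target, normal_control,
                      fold_threshold=2.0):
    """Classify the sample ratio relative to the normal-tissue ratio.

    fold_threshold is a hypothetical cutoff chosen for illustration.
    """
    sample_ratio = expression_ratio(sample_target, sample_control)
    normal_ratio = expression_ratio(normal_target, normal_control)
    fold_change = sample_ratio / normal_ratio
    if fold_change >= fold_threshold:
        return "up-regulated"
    if fold_change <= 1.0 / fold_threshold:
        return "down-regulated"
    return "substantially the same"
```

In this sketch, a result of "substantially the same" corresponds to the case in which the
sample is not determined to contain cancer cells, and either of the other two results
corresponds to a difference in ratio between the normal and sample tissues.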

Methods of identifying polymorphisms, mutations, etc., of differentially regulated cancer genes

Polynucleotides of the present invention can also be utilized to identify mutant alleles,
SNPs, gene rearrangements and modifications, and other polymorphisms of the wild-type
gene. Mutant alleles, polymorphisms, SNPs, etc., can be identified and isolated from
subjects with diseases that are known, or suspected, to have a genetic component.
Identification of such genes can be carried out routinely (see above for more guidance), e.g.,
using PCR, hybridization techniques, direct sequencing, mismatch reactions (see, e.g.,
above), RFLP analysis, SSCP (e.g., Orita et al., Proc. Natl. Acad. Sci., 86:2766, 1992), etc.,
where a polynucleotide having a sequence selected from SEQ ID NOS 1, 3, 5, 7, 9, 11, 13,
15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84,
85, 86, 88, 95, and/or 99 is used as a probe. The selected mutant alleles, SNPs,
polymorphisms, etc., can be used diagnostically to determine whether a subject has, or is
susceptible to, a disorder associated with differentially regulated cancer genes, as well as to
design therapies and predict the outcome of the disorder. Methods involve, e.g., diagnosing a
disorder associated with differentially regulated cancer genes or determining susceptibility to
a disorder, comprising detecting the presence of a mutation in a gene represented by a
polynucleotide selected from SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29,
45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99.
The detecting can be carried out by any effective method, e.g., obtaining cells from a subject,
determining the gene sequence or structure of a target gene (using, e.g., mRNA, cDNA,
genomic DNA, etc.), and comparing the sequence or structure of the target gene to the
structure of the normal gene, whereby a difference in sequence or structure indicates a
mutation in the gene in the subject. Polynucleotides can also be used to test for mutations,
SNPs, polymorphisms, etc., e.g., using mismatch DNA repair technology as described in U.S.
Pat. No. 5,683,877; U.S. Pat. No. 5,656,430; Wu et al., Proc. Natl. Acad. Sci., 89:8779-8783,
1992.
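
As a minimal sketch of the sequence comparison step, assuming the target and normal
sequences have already been aligned to equal length, the differences can be listed position
by position. The function name and the short example sequences are hypothetical and are not
taken from the sequence listing.

```python
def find_differences(normal_seq, target_seq):
    """List (position, normal_base, target_base) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length.
    Positions are 1-based.
    """
    if len(normal_seq) != len(target_seq):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(normal_seq.upper(), target_seq.upper())
    return [(i + 1, n, t) for i, (n, t) in enumerate(pairs) if n != t]


# Hypothetical example: a single base substitution at position 4.
differences = find_differences("ATGGCC", "ATGACC")
has_mutation = bool(differences)
```

A non-empty result indicates a difference in sequence between the target gene and the normal
gene; real data would first require an alignment step, which is outside this sketch.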

The present invention also relates to methods of detecting polymorphisms in
differentially regulated cancer genes, comprising, e.g., comparing the structure of: genomic
DNA comprising all or part of a differentially regulated cancer gene, mRNA comprising all
or part of a differentially regulated cancer gene, cDNA comprising all or part of a
differentially regulated cancer gene, or a polypeptide comprising all or part of a differentially
regulated cancer gene, with the structure of a differentially regulated cancer gene, e.g., as set
forth in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52,
54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99. The methods can be
carried out on a sample from any source, e.g., cells, tissues, body fluids, blood, urine, stool,
hair, egg, sperm, cerebral spinal fluid, etc.

These methods can be implemented in many different ways. For example,
"comparing the structure" steps include, but are not limited to, comparing restriction maps,
nucleotide sequences, amino acid sequences, RFLPs, DNAase sites, DNA methylation
fingerprints (e.g., U.S. Pat. No. 6,214,556), protein cleavage sites, molecular weights,
electrophoretic mobilities, charges, ion mobility, etc., between a standard differentially
regulated cancer gene and a test differentially regulated cancer gene. The term "structure"
can refer to any physical characteristics or configurations which can be used to distinguish
between nucleic acids and polypeptides. The methods and instruments used to accomplish
the comparing step depend upon the physical characteristics which are to be compared.
Thus, various techniques are contemplated, including, e.g., sequencing machines (both amino
acid and polynucleotide), electrophoresis, mass spectrometer (U.S. Pat. Nos. 6,093,541,
6,002,127), liquid chromatography, HPLC, etc.
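
One of the comparisons named above, the comparison of restriction maps, can be illustrated
with the following sketch. It assumes a linear sequence and, as a simplification, places
each cut at the first base of the recognition site rather than at the enzyme's true cleavage
position. The sequences and function names are invented for the example.

```python
def restriction_fragments(sequence, site):
    """Sorted fragment lengths after cutting a linear sequence at each site.

    Simplification: the cut is placed at the start of each recognition site.
    """
    seq, site = sequence.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    boundaries = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(boundaries, boundaries[1:]) if b > a)


def same_restriction_pattern(standard_seq, test_seq, site):
    """True when both sequences give the same set of fragment lengths."""
    return restriction_fragments(standard_seq, site) == restriction_fragments(test_seq, site)


# Hypothetical sequences: the test sequence has lost the second GAATTC site.
standard = "AAA" + "GAATTC" + "TTTT" + "GAATTC" + "CC"
test = "AAA" + "GAATTC" + "TTTT" + "GAGTTC" + "CC"
```

Here the loss of one recognition site in the test sequence changes the fragment pattern,
which is the kind of difference an RFLP comparison is intended to reveal.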

To carry out such methods, "all or part" of the gene or polypeptide can be compared.
For example, if nucleotide sequencing is utilized, the entire gene can be sequenced, including
promoter, introns, and exons, or only parts of it can be sequenced and compared, e.g., exon 1,
exon 2, etc.

Mutagenesis 

Mutated polynucleotide sequences of the present invention are useful for various
purposes, e.g., to create mutations of the polypeptides they encode, to identify functional
regions of genomic DNA, to produce probes for screening libraries, etc. Mutagenesis can be
carried out routinely according to any effective method, e.g., oligonucleotide-directed (Smith,
M., Ann. Rev. Genet., 19:423-463, 1985), degenerate oligonucleotide-directed (Hill et al.,
Methods in Enzymology, 155:558-568, 1987), region-specific (Myers et al., Science, 229:242-246,
1985; Derbyshire et al., Gene, 46:145, 1986; Ner et al., DNA, 7:127, 1988),
linker-scanning (McKnight and Kingsbury, Science, 217:316-324, 1982), directed using PCR,
recursive ensemble mutagenesis (Arkin and Youvan, Proc. Natl. Acad. Sci., 89:7811-7815,
1992), random mutagenesis (e.g., U.S. Pat. Nos. 5,096,815; 5,198,346; and 5,223,409),
site-directed mutagenesis (e.g., Walder et al., Gene, 42:133, 1986; Bauer et al., Gene, 37:73,
1985; Craik, BioTechniques, January 1985, 12-19; Smith et al., Genetic Engineering:
Principles and Methods, Plenum Press, 1981), phage display (e.g., Lowman et al., Biochem.
30:10832-10837, 1991; Ladner et al., U.S. Pat. No. 5,223,409; Huse, WIPO Publication WO
92/06204), etc. Desired sequences can also be produced by the assembly of target sequences
using mutually priming oligonucleotides (Uhlmann, Gene, 71:29-40, 1988). For directed
mutagenesis methods, analysis of the three-dimensional structure of the differentially
regulated cancer gene polypeptide can be used to guide and facilitate making mutants which
affect polypeptide activity. Sites of substrate-enzyme interaction or other biological activities
can also be determined by analysis of crystal structure as determined by such techniques as
nuclear magnetic resonance, crystallography or photoaffinity labeling. See, for example, de
Vos et al., Science 255:306-312, 1992; Smith et al., J. Mol. Biol. 224:899-904, 1992;
Wlodawer et al., FEBS Lett. 309:59-64, 1992.

In addition, libraries of differentially regulated cancer genes and fragments thereof
can be used for screening and selection of differentially regulated cancer gene variants. For
instance, a library of coding sequences can be generated by treating a double-stranded DNA
with a nuclease under conditions where the nicking occurs, e.g., only once per molecule,
denaturing the double-stranded DNA, renaturing it to form double-stranded DNA which can
include sense/antisense pairs from different nicked products, removing single-stranded
portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting
DNAs into an expression vector. By this method, expression libraries can be made
comprising "mutagenized" differentially regulated cancer genes. The entire coding sequence
or parts thereof can be used.

Polynucleotide expression, polypeptides produced thereby, and specific-binding partners thereto

A polynucleotide according to the present invention can be expressed in a variety of
different systems, in vitro and in vivo, according to the desired purpose. For example, a
polynucleotide can be inserted into an expression vector, introduced into a desired host, and
cultured under conditions effective to achieve expression of a polypeptide coded for by the
polynucleotide, to search for specific binding partners. Effective conditions include any
culture conditions which are suitable for achieving production of the polypeptide by the host
cell, including effective temperatures, pH, medium, additives to the media in which the host
cell is cultured (e.g., additives which amplify or induce expression such as butyrate, or
methotrexate if the coding polynucleotide is adjacent to a dhfr gene), cycloheximide, cell
densities, culture dishes, etc. A polynucleotide can be introduced into the cell by any
effective method including, e.g., naked DNA, calcium phosphate precipitation,
electroporation, injection, DEAE-Dextran mediated transfection, fusion with liposomes,
association with agents which enhance its uptake into cells, and viral transfection. A cell into
which a polynucleotide of the present invention has been introduced is a transformed host
cell. The polynucleotide can be extrachromosomal or integrated into a chromosome(s) of the
host cell. It can be stable or transient. An expression vector is selected for its compatibility
with the host cell. Host cells include mammalian cells, e.g., COS, CV1, BHK, CHO, HeLa,
LTK, NIH 3T3, PC-3 (CRL-1435), LNCaP (CRL-1740), CA-HPV-10 (CRL-2220), PZ-HPV-7
(CRL-2221), MDA-PCa 2b (CRL-2422), 22Rv1 (CRL-2505), NCI-H660 (CRL-5813), Hs 804.Sk
(CRL-7535), LNCaP-FGF (CRL-10995), RWPE-1 (CRL-11609), RWPE-2 (CRL-11610),
PWR-1E (CRL-11611), rat MAT-Ly-LuB-2 (CRL-2376), and other primary and established
prostate and prostate cancer cell lines, ZR-75-1 (ATCC CRL-1500), ZR-75-30 (ATCC
CRL-1504), UACC-812 (ATCC CRL-1897), UACC-893 (ATCC CRL-1902), HCC38 (ATCC
CRL-2314), HCC70 (CRL-2315), and other HCC cell lines (e.g., as deposited with the
ATCC), AU565 (ATCC CRL-2351), Hs 496.T (ATCC CRL-7303), Hs 748.T (ATCC
CRL-7486), SW527 (ATCC CRL-7940), 184A1 (ATCC CRL-8798), MCF cell lines (e.g., 10A
and others deposited with the ATCC), MDA-MB-134-VI (ATCC HTB-23 and other MDA
cell lines), SK-BR-3 (ATCC HTB-30), ME-180 (ATCC HTB-33), Hs 578Bst (ATCC
HTB-125), Hs 578T (ATCC HTB-126), T-47D (ATCC HTB-133), and other primary and
established breast and breast cancer cell lines, insect cells, such as Sf9 (S. frugiperda) and
Drosophila, bacteria, such as E. coli, Streptococcus, Bacillus, yeast, such as Saccharomyces, S.
cerevisiae, fungal cells, plant cells, embryonic or adult stem cells (e.g., mammalian, such as
mouse or human).

Expression control sequences are similarly selected for host compatibility and a
desired purpose, e.g., high copy number, high amounts, induction, amplification, controlled
expression. Other sequences which can be employed include enhancers such as from SV40,
CMV, RSV, inducible promoters, cell-type specific elements, or sequences which allow
selective or specific cell expression. Promoters that can be used to drive its expression
include, e.g., the endogenous promoter, MMTV, SV40, trp, lac, tac, or T7 promoters for
bacterial hosts; or alpha factor, alcohol oxidase, or PGH promoters for yeast. RNA
promoters can be used to produce RNA transcripts, such as T7 or SP6. See, e.g., Melton et
al., Nucleic Acids Res., 12(18):7035-7056, 1984; Dunn and Studier, J. Mol. Bio., 166:477-435,
1984; U.S. Pat. No. 5,891,636; Studier et al., Gene Expression Technology, Methods in
Enzymology, 85:60-89, 1987. In addition, as discussed above, translational signals (including
in-frame insertions) can be included.

When a polynucleotide is expressed as a heterologous gene in a transfected cell line,
the gene is introduced into a cell as described above, under effective conditions in which the
gene is expressed. The term "heterologous" means that the gene has been introduced into the
cell line by the "hand-of-man." Introduction of a gene into a cell line is discussed above.
The transfected (or transformed) cell expressing the gene can be lysed or the cell line can be
used intact.

For expression and other purposes, a polynucleotide can contain codons found in a
naturally-occurring gene, transcript, or cDNA, e.g., as set forth in SEQ ID NOS
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64,
66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99, or it can contain degenerate codons coding
for the same amino acid sequences. For instance, it may be desirable to change the codons in
the sequence to optimize the sequence for expression in a desired host. See, e.g., U.S. Pat.
Nos. 5,567,600 and 5,567,862.
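
The use of degenerate codons coding for the same amino acid sequence can be illustrated
with a short sketch. The codon table below covers only four amino acids, and the "preferred"
codon for each is a hypothetical host preference chosen for the example; an actual
optimization would use the full genetic code and a codon usage table for the intended host.

```python
# Partial standard genetic code, limited to the codons used in this example.
CODON_TO_AA = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
}

# Hypothetical preferred codon per amino acid in the desired host.
PREFERRED_CODON = {"M": "ATG", "A": "GCC", "K": "AAG", "E": "GAG"}


def split_codons(cds):
    """Split a coding sequence into upper-case triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))


def recode(cds):
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[CODON_TO_AA[codon]] for codon in split_codons(cds))
```

In this sketch the recoded sequence differs from the original at the nucleotide level while
encoding the identical amino acid sequence.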

A polypeptide according to the present invention can be recovered from natural
sources or transformed host cells (culture medium or cells) according to the usual methods,
including detergent extraction (e.g., non-ionic detergent, Triton X-100, CHAPS,
octylglucoside, Igepal CA-630), ammonium sulfate or ethanol precipitation, acid extraction,
anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic
interaction chromatography, hydroxyapatite chromatography, lectin chromatography, and gel
electrophoresis. Protein refolding steps can be used, as necessary, in completing the
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for purification steps. Another approach is to express the polypeptide
recombinantly with an affinity tag (Flag epitope, HA epitope, myc epitope, 6xHis, maltose
binding protein, chitinase, etc.) and then purify by anti-tag antibody-conjugated affinity
chromatography.

The present invention also relates to polypeptides of differentially regulated cancer
genes, e.g., an isolated human differentially regulated cancer gene polypeptide comprising or
having the amino acid sequence set forth in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95,
and/or 99, an isolated mammalian differentially regulated cancer gene polypeptide
comprising an amino acid sequence, e.g., having at least 90%, 95%, 99%, or more amino acid
sequence identity to the amino acid sequence set forth in SEQ ID NOS 2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24, 26, 28, 30, 46, 51, 49, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 87, 89,
96 and/or 100, and optionally having one or more of the differentially regulated cancer gene
activities. Fragments specific to differentially regulated cancer genes can also be used, e.g., to
produce antibodies or other immune responses, as competitors, etc. These fragments can be
referred to as being "specific for" a differentially regulated cancer gene. The latter phrase, as
already defined, indicates that the peptides are characteristic of a particular gene, and that
the defined sequences are substantially absent from all other protein types. Such
polypeptides can be of any size which is necessary to confer specificity, e.g., 5, 8, 10, 12, 15,
20, etc.
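
For illustration only, percent amino acid sequence identity between two pre-aligned
sequences can be computed as in the sketch below. The convention used here (identical
residues divided by alignment length, with gap positions counted as mismatches) is one of
several in common use and is an assumption of this example, as are the function names and
the short peptide sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences.

    Gap characters ("-") never count as matches.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a, aligned_b, threshold_percent):
    """True when identity is at least the stated threshold, e.g., 90, 95, or 99."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

With two hypothetical ten-residue peptides that differ at one position, the sketch reports
90% identity, which satisfies a 90% threshold but not a 95% threshold.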

The present invention also relates to specific-binding partners. These include
antibodies which are specific for polypeptides encoded by polynucleotides of the present
invention, as well as other binding-partners which interact with polynucleotides and
polypeptides of the present invention. Protein-protein interactions between differentially
regulated cancer genes and other polypeptides and binding partners can be identified using
any suitable methods, e.g., protein binding assays (e.g., filtration assays, chromatography,
etc.), yeast two-hybrid system (Fields and Song, Nature, 340:245-247, 1989), protein arrays,
gel-shift assays, FRET (fluorescence resonance energy transfer) assays, etc. Nucleic acid
interactions (e.g., protein-DNA or protein-RNA) can be assessed using gel-shift assays, e.g.,
as carried out in U.S. Pat. Nos. 6,333,407 and 5,789,538.

Antibodies, e.g., polyclonal, monoclonal, recombinant, chimeric, humanized,
single-chain, Fab, and fragments thereof, can be prepared according to any desired method. See,
also, screening recombinant immunoglobulin libraries (e.g., Orlandi et al., Proc. Natl. Acad.
Sci., 86:3833-3837, 1989; Huse et al., Science, 256:1275-1281, 1989); in vitro stimulation of
lymphocyte populations; Winter and Milstein, Nature, 349:293-299, 1991. The antibodies
can be IgM, IgG, subtypes, IgG2a, IgG1, etc. Antibodies, and immune responses, can also be
generated by administering naked DNA. See, e.g., U.S. Pat. Nos. 5,703,055; 5,589,466;
5,580,859. Antibodies can be used from any source, including goat, rabbit, mouse, and chicken
(e.g., IgY; see Duan, WO/029444 for methods of making antibodies in avian hosts, and
harvesting the antibodies from the eggs). An antibody specific for a polypeptide means that
the antibody recognizes a defined sequence of amino acids within or including the
polypeptide. Other specific binding partners include, e.g., aptamers and PNA. Antibodies
can be prepared against specific epitopes or domains of a differentially regulated cancer gene.
Antibodies can also be humanized, e.g., where they are to be used therapeutically.

The term "antibody" as used herein includes intact molecules as well as fragments
thereof, such as Fab, F(ab')2, and Fv which are capable of binding to an epitopic determinant
present in Bin1 polypeptide. Such antibody fragments retain some ability to selectively bind
with their antigen or receptor. The term "epitope" refers to an antigenic determinant on an
antigen to which the paratope of an antibody binds. Epitopic determinants usually consist of
chemically active surface groupings of molecules such as amino acids or sugar side chains
and usually have specific three dimensional structural characteristics, as well as specific
charge characteristics. Antibodies can be prepared against specific epitopes or polypeptide
domains.

Antibodies which bind to differentially regulated cancer gene polypeptides of the
present invention can be prepared using an intact polypeptide or fragments containing small
peptides of interest as the immunizing antigen. For example, it may be desirable to produce
antibodies that specifically bind to the N- or C-terminal domains of differentially regulated
cancer genes. The polypeptide or peptide used to immunize an animal can be derived from
translated cDNA or chemically synthesized, and can be conjugated to a carrier protein, if
desired. Commonly used carriers which are chemically coupled to the immunizing
peptide include keyhole limpet hemocyanin (KLH), thyroglobulin, bovine serum albumin
(BSA), and tetanus toxoid.

Methods of detecting polypeptides 

Polypeptides coded for by differentially regulated cancer genes of the present
invention can be detected, visualized, determined, quantitated, etc. according to any effective
method. Useful methods include, e.g., but are not limited to, immunoassays, RIA
(radioimmunoassay), ELISA (enzyme-linked immunosorbent assay), immunofluorescence,
flow cytometry, histology, electron microscopy, light microscopy, in situ assays,
immunoprecipitation, Western blot, etc.
Immunoassays may be carried out in liquid or on a biological support. For instance, a
sample (e.g., blood, stool, urine, cells, tissue, cerebral spinal fluid, body fluids, etc.) can be
brought in contact with and immobilized onto a solid phase support or carrier such as
nitrocellulose, or other solid support that is capable of immobilizing cells, cell particles or
soluble proteins. The support may then be washed with suitable buffers followed by
treatment with the detectably labeled differentially regulated cancer gene specific antibody.
The solid phase support can then be washed with a buffer a second time to remove unbound
antibody. The amount of bound label on the solid support may then be detected by conventional
means.

A "solid phase support or carrier" includes any support capable of binding an antigen,
antibody, or other specific binding partner. Supports or carriers include glass, polystyrene,
polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses,
polyacrylamides, and magnetite. A support material can have any structural or physical
configuration. Thus, the support configuration may be spherical, as in a bead, or cylindrical,
as in the inside surface of a test tube, or the external surface of a rod. Alternatively, the
surface may be flat such as a sheet, test strip, etc.

One of the many ways in which gene peptide-specific antibody can be detectably
labeled is by linking it to an enzyme and using it in an enzyme immunoassay (EIA). See,
e.g., Voller, A., "The Enzyme Linked Immunosorbent Assay (ELISA)," 1978, Diagnostic
Horizons 2, 1-7, Microbiological Associates Quarterly Publication, Walkersville, Md.;
Voller, A. et al., 1978, J. Clin. Pathol. 31, 507-520; Butler, J. E., 1981, Meth. Enzymol. 73,
482-523; Maggio, E. (ed.), 1980, Enzyme Immunoassay, CRC Press, Boca Raton, Fla. The
enzyme which is bound to the antibody will react with an appropriate substrate, preferably a
chromogenic substrate, in such a manner as to produce a chemical moiety that can be
detected, for example, by spectrophotometric, fluorimetric or visual means. Enzymes that
can be used to detectably label the antibody include, but are not limited to, malate
dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol
dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase,
horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase,
beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase,
glucoamylase and acetylcholinesterase. The detection can be accomplished by colorimetric
methods that employ a chromogenic substrate for the enzyme. Detection may also be
accomplished by visual comparison of the extent of enzymatic reaction of a substrate in
comparison with similarly prepared standards.

Detection may also be accomplished using any of a variety of other immunoassays.
For example, by radioactively labeling the antibodies or antibody fragments, it is possible to
detect differentially regulated cancer gene peptides through the use of a radioimmunoassay
(RIA). See, e.g., Weintraub, B., Principles of Radioimmunoassays, Seventh Training Course
on Radioligand Assay Techniques, The Endocrine Society, March, 1986. The radioactive
isotope can be detected by such means as the use of a gamma counter or a scintillation
counter or by autoradiography.

It is also possible to label the antibody with a fluorescent compound. When the
fluorescently labeled antibody is exposed to light of the proper wavelength, its presence can
then be detected due to fluorescence. Among the most commonly used fluorescent labeling
compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin,
allophycocyanin, o-phthaldehyde and fluorescamine. The antibody can also be detectably
labeled using fluorescence emitting metals such as those in the lanthanide series. These
metals can be attached to the antibody using such metal chelating groups as
diethylenetriaminepentaacetic acid (DTPA) or ethylenediaminetetraacetic acid (EDTA).

The antibody also can be detectably labeled by coupling it to a chemiluminescent
compound. The presence of the chemiluminescent-tagged antibody is then determined by
detecting the presence of luminescence that arises during the course of a chemical reaction.
Examples of useful chemiluminescent labeling compounds are luminol, isoluminol,
theromatic acridinium ester, imidazole, acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used to label the antibody of the
present invention. Bioluminescence is a type of chemiluminescence found in biological
systems in which a catalytic protein increases the efficiency of the chemiluminescent
reaction. The presence of a bioluminescent protein is determined by detecting the presence of
luminescence. Important bioluminescent compounds for purposes of labeling are luciferin,
luciferase and aequorin.

Diagnostic 

The present invention also relates to methods and compositions for diagnosing
prostate or breast cancer, or determining susceptibility to said cancer, using polynucleotides,
polypeptides, and specific-binding partners of the present invention to detect, assess,
determine, etc., the expression of differentially regulated cancer genes and their polypeptide
products. In such methods, the gene can serve as a marker for the disorder, e.g., where the
gene, when mutant, is a direct cause of the disorder; where the gene is affected by another
gene(s) which is directly responsible for the disorder, e.g., when the gene is part of the same
signaling pathway as the directly responsible gene; and, where the gene is chromosomally
linked to the gene(s) directly responsible for the disorder, and segregates with it. Many other
situations are possible. To detect, assess, determine, etc., a probe specific for the gene can be
employed as described above and below. Any method of detecting and/or assessing the gene
can be used, including detecting expression of the gene using polynucleotides, antibodies, or
other specific-binding partners.

The present invention relates to methods of diagnosing a cancer associated with a
differentially regulated cancer gene of the present invention, or determining a subject's
susceptibility to such cancer, comprising, e.g., assessing the expression of a gene of the
present invention in a tissue sample comprising tissue or cells suspected of having cancer.
The phrase "diagnosing" indicates that it is determined whether the sample has the disorder.
A "disorder" means, e.g., any abnormal condition as in a disease or malady. "Determining a
subject's susceptibility to a disease or disorder" indicates that the subject is assessed for
whether s/he is predisposed to get such a disease or disorder, where the predisposition is
indicated by abnormal expression of the gene (e.g., gene mutation, gene expression pattern is
not normal, etc.). Predisposition or susceptibility to a disease may result when such a disease
is influenced by epigenetic, environmental, etc., factors. Diagnosing includes prenatal
screening where samples from the fetus or embryo (e.g., via amniocentesis or CV sampling)
are analyzed for the expression of the gene.

By the phrase "assessing expression of a differentially regulated gene," it is meant
that the functional status of the gene is evaluated. This includes, but is not limited to,
measuring expression levels of said gene, determining the genomic structure of said gene,
determining the mRNA structure of transcripts from said gene, or measuring the expression
levels of polypeptide coded for by said gene. Thus, the term "assessing expression" includes
evaluating all aspects of the transcriptional and translational machinery of the gene. For
instance, if a promoter defect causes, or is suspected of causing, the disorder, then a sample
can be evaluated (i.e., "assessed") by looking (e.g., sequencing or restriction mapping) at the
promoter sequence in the gene, by detecting transcription products (e.g., RNA), or by detecting
translation product (e.g., polypeptide). Any measure of whether the gene is functional can be
used, including polypeptide, polynucleotide, and functional assays for the gene's biological
activity.






In making the assessment, it can be useful to compare the results to a normal gene,
e.g., a gene which is not associated with the disorder. The nature of the comparison can be
determined routinely, depending upon how the assessing is accomplished. If, for example,
the mRNA levels of a sample are detected, then the mRNA levels of a normal sample can serve
as a comparison, or those of a gene which is known not to be affected by the disorder. Methods of
detecting mRNA are well known, and discussed above, e.g., but not limited to, Northern blot
analysis, polymerase chain reaction (PCR), reverse transcriptase PCR, RACE PCR, etc.
Similarly, if polypeptide production is used to evaluate the gene, then the polypeptide in a
normal tissue sample can be used as a comparison, or polypeptide from a different gene
whose expression is known not to be affected by the disorder. These are only examples of
how such a method could be carried out.

The genes and polypeptides of the present invention can be used to identify, detect,
stage, determine the presence of, prognosticate, treat, study, etc., breast, prostate, and other
cancer. The present invention relates to methods of identifying a genetic basis for a disease
or disease-susceptibility, comprising, e.g., determining the association of a cancer or cancer
susceptibility with a gene of the present invention. An association between a disease or
disease-susceptibility and nucleotide sequence includes, e.g., establishing (or finding) a
correlation (or relationship) between a DNA marker (e.g., gene, VNTR, polymorphism, EST,
etc.) and a particular disease state. Once a relationship is identified, the DNA marker can be
utilized in diagnostic tests and as a drug target. Any region of the gene can be used as a
source of the DNA marker: exons, introns, intergenic regions, etc.

Human linkage maps can be constructed to establish a relationship between a gene
and cancer. Typically, polymorphic molecular markers (e.g., STRP's, SNP's, RFLP's,
VNTR's) are identified within the region, linkage and map distance between the markers is
then established, and then linkage is established between phenotype and the various
individual molecular markers. Maps can be produced for an individual family, selected
populations, patient populations, etc. In general, these methods involve identifying a marker
associated with the disease (e.g., identifying a polymorphism in a family which is linked to
the disease) and then analyzing the surrounding DNA to identify the gene responsible for the
phenotype. See, e.g., Kruglyak et al., Am. J. Hum. Genet., 58, 1347-1363, 1996; Matise et
al., Nat. Genet., 6(4):384-90, 1994.




Assessing the effects of therapeutic and preventative interventions (e.g.,
administration of a drug, chemotherapy, radiation, etc.) on cancer is a major effort in drug
discovery, clinical medicine, and pharmacogenomics. The evaluation of therapeutic and
preventative measures, whether experimental or already in clinical use, has broad
applicability, e.g., in clinical trials, for monitoring the status of a patient, for analyzing and
assessing animal models, and in any scenario involving disease treatment and prevention.
Analyzing the expression profiles of polynucleotides of the present invention can be utilized
as a parameter by which interventions are judged and measured. Treatment of a disorder can
change the expression profile in some manner which is prognostic or indicative of the drug's
effect on it. Changes in the profile can indicate, e.g., drug toxicity, return to a normal level,
etc. Accordingly, the present invention also relates to methods of monitoring or assessing a
therapeutic or preventative measure (e.g., chemotherapy, radiation, anti-neoplastic drugs,
antibodies, etc.) in a subject having a cancer, or, susceptible to cancer, comprising, e.g.,
detecting the expression levels of differentially regulated cancer genes. A subject can be a
cell-based assay system, non-human animal model, human patient, etc. Detecting can be
accomplished as described for the methods above and below. By "therapeutic or preventative
intervention," it is meant, e.g., a drug administered to a patient, surgery, radiation,
chemotherapy, and other measures taken to prevent, treat, or diagnose a disorder.

The present invention also relates to methods of using differentially regulated cancer
genes binding partners, such as antibodies, to deliver active agents to the cancer for a variety
of different purposes, including, e.g., for diagnostic, therapeutic (e.g., to treat cancer), and
research purposes. Methods can involve delivering or administering an active agent to the
cancer, comprising, e.g., administering to a subject in need thereof, an effective amount of an
active agent coupled to a binding partner specific for human differentially regulated cancer
genes polypeptide, wherein said binding partner is effective to deliver said active agent
specifically to said cancer.

Any type of active agent can be used in combination with a binding partner,
including, therapeutic, cytotoxic, cytostatic, chemotherapeutic, anti-neoplastic,
anti-proliferative, anti-biotic, etc., agents. A chemotherapeutic agent can be, e.g.,
DNA-interactive agent, alkylating agent, antimetabolite, tubulin-interactive agent, hormonal
agent, hydroxyurea, Cisplatin, Cyclophosphamide, Altretamine, Bleomycin, Dactinomycin,
Doxorubicin, Etoposide, Teniposide, paclitaxel, Cytoxan,
2-methoxy-carbonyl-amino-benzimidazole, Plicamycin, Methotrexate, Fluorouracil,
Fluorodeoxyuridin, CB3717, Azacitidine, Floxuridine, Mercaptopurine, 6-Thioguanine,
Pentostatin, Cytarabine, Fludarabine, etc. Agents can also be contrast agents useful in
imaging technology, e.g., X-ray, CT, CAT, MRI, ultrasound, PET, SPECT, and scintographic.

An active agent can be associated in any manner with a differentially regulated cancer
genes binding partner which is effective to achieve its delivery specifically to the target.
Specific delivery or targeting indicates that the agent is provided to the cancer, without being
substantially provided to other tissues. This is useful especially where an agent is toxic, and
specific targeting to the cancer enables the majority of the toxicity to be aimed at it, with as
small as possible effect on other tissues in the body. The association of the active agent and
the binding partner ("coupling") can be direct, e.g., through chemical bonds between the
binding partner and the agent, or, via a linking agent, or the association can be less direct,
e.g., where the active agent is in a liposome, or other carrier, and the binding partner is
associated with the liposome surface. In such case, the binding partner can be oriented in
such a way that it is able to bind to differentially regulated cancer gene product, e.g., on the
cell surface. Methods for delivery of DNA via a cell-surface receptor are described, e.g., in
U.S. Pat. No. 6,339,139.

Identifying agent methods

The present invention also relates to methods of identifying agents, and the agents 
themselves, which modulate a differentially regulated cancer gene. These agents can be used 
to modulate the biological activity of the polypeptide encoded by the gene, or the gene itself.
Agents which regulate the gene or its product are useful in a variety of different environments,
including as medicinal agents to treat or prevent disorders associated with differentially
regulated cancer genes and as research reagents to modify the function of tissues and cells.

Methods of identifying agents generally comprise steps in which an agent is placed in
contact with the gene, its transcription product, its translation product, or other target, and
then a determination is performed to assess whether the agent "modulates" the target. The
specific method utilized will depend upon a number of factors, including, e.g., the target (i.e.,
is it the gene or polypeptide encoded by it), the environment (e.g., in vitro or in vivo), the
composition of the agent, etc.

For modulating the expression of a differentially regulated cancer gene, a method can
comprise, in any effective order, one or more of the following steps, e.g., contacting a
differentially regulated cancer gene (e.g., in a cell population) with a test agent under
conditions effective for said test agent to modulate the expression of said differentially
regulated cancer gene, and determining whether said test agent modulates said differentially
regulated cancer gene. An agent can modulate expression of differentially regulated cancer
gene at any level, including transcription (e.g., by modulating the promoter), translation,
and/or perdurance of the nucleic acid (e.g., degradation, stability, etc.) in the cell.

For modulating the biological activity of differentially regulated cancer gene product, 
such as a polypeptide, a method can comprise, in any effective order, one or more of the
following steps, e.g., contacting a differentially regulated cancer gene polypeptide (e.g., in a
cell, lysate, or isolated) with a test agent under conditions effective for said test agent to
modulate the biological activity of said polypeptide, and determining whether said test agent 
modulates said biological activity. 

Contacting a differentially regulated cancer gene with the test agent can be
accomplished by any suitable method and/or means that places the agent in a position to
functionally control expression or biological activity of the gene present in the sample.
Functional control indicates that the agent can exert its physiological effect on differentially
regulated cancer genes through whatever mechanism it works. The choice of the method
and/or means can depend upon the nature of the agent and the condition and type of
environment in which the differentially regulated cancer genes is presented, e.g., lysate,
isolated, or in a cell population (such as, in vivo, in vitro, organ explants, etc.). For instance,
if the cell population is an in vitro cell culture, the agent can be contacted with the cells by
adding it directly into the culture medium. If the agent cannot dissolve readily in an aqueous
medium, it can be incorporated into liposomes, or another lipophilic carrier, and then
administered to the cell culture. Contact can also be facilitated by incorporation of agent
with carriers and delivery molecules and complexes, by injection, by infusion, etc.

Agents can be directed to, or targeted to, any part of the polypeptide which is
effective for modulating it. For example, agents, such as antibodies and small molecules, can
be targeted to cell-surface, exposed, extracellular, ligand binding, functional, etc., domains of
the polypeptide. Agents can also be directed to intracellular regions and domains, e.g., 
regions where the polypeptide couples or interacts with intracellular or intramembrane 
binding partners. 

After the agent has been administered in such a way that it can gain access to
differentially regulated cancer genes, it can be determined whether the test agent modulates
differentially regulated cancer genes expression or biological activity. Modulation can be of
any type, quality, or quantity, e.g., increase, facilitate, enhance, up-regulate, stimulate,
activate, amplify, augment, induce, decrease, down-regulate, diminish, lessen, reduce, etc.
The modulatory quantity can also encompass any value, e.g., 1%, 5%, 10%, 50%, 75%,
1-fold, 2-fold, 5-fold, 10-fold, 100-fold, etc. To modulate differentially regulated cancer genes
expression means, e.g., that the test agent has an effect on its expression, e.g., to affect the
amount of transcription, to affect RNA splicing, to affect translation of the RNA into
polypeptide, to affect RNA or polypeptide stability, to affect polyadenylation or other
processing of the RNA, to affect post-transcriptional or post-translational processing, etc. To
modulate biological activity means, e.g., that a functional activity of the polypeptide is
changed in comparison to its normal activity in the absence of the agent. This effect
includes, increase, decrease, block, inhibit, enhance, etc.
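
The percentage and fold values mentioned above can be related by simple arithmetic. The
Python sketch below is illustrative only and is not taken from the specification: it
assumes a single measured level (e.g., of expression or activity) with and without the test
agent, and the function name `describe_modulation` and the example numbers are hypothetical.

```python
def describe_modulation(baseline, treated):
    """Summarize how a test agent changed a measured level.

    baseline: level in the absence of the agent (must be > 0).
    treated:  level in the presence of the agent (must be >= 0).
    Both are hypothetical measurements in the same arbitrary units.
    """
    if baseline <= 0 or treated < 0:
        raise ValueError("baseline must be positive and treated non-negative")
    percent_change = (treated - baseline) / baseline * 100.0
    fold = treated / baseline
    if treated > baseline:
        direction = "increase"
    elif treated < baseline:
        direction = "decrease"
    else:
        direction = "no change"
    return {"direction": direction, "percent_change": percent_change, "fold": fold}


# Hypothetical example: the agent reduces the level from 100 to 25 units.
print(describe_modulation(100.0, 25.0))
```

Under these assumptions, a drop from 100 to 25 units is reported as a 75% decrease, and a
rise from 100 to 200 units as a 2-fold level.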

A test agent can be of any molecular composition, e.g., chemical compounds,
biomolecules, such as polypeptides, lipids, nucleic acids, carbohydrates, antibodies,
ribozymes, double-stranded RNA, aptamers, etc. For example, if a polypeptide to be
modulated is a cell-surface molecule, a test agent can be an antibody that specifically
recognizes it and, e.g., causes the polypeptide to be internalized, leading to its down
regulation on the surface of the cell. Such an effect does not have to be permanent, but can
require the presence of the antibody to continue the down-regulatory effect. Antibodies can
also be used to modulate the biological activity of a polypeptide in a lysate or other cell-free 
form. 

Therapeutics 

Selective polynucleotides, polypeptides, and specific-binding partners thereto, can be
utilized in therapeutic applications, especially to treat prostate and breast cancers. Useful
methods include, but are not limited to, immunotherapy (e.g., using specific-binding partners
to polypeptides), vaccination (e.g., using a selective polypeptide or a naked DNA encoding
such polypeptide), protein or polypeptide replacement therapy, gene therapy (e.g., germ-line
correction, antisense), etc.

Various immunotherapeutic approaches can be used. For instance, unlabeled
antibody that specifically recognizes a tissue-specific antigen can be used to stimulate the
body to destroy or attack a cancer or other diseased tissue, to cause down-regulation, to
produce complement-mediated lysis, to inhibit cell growth, etc., of target cells which display
the antigen, e.g., analogously to how c-erbB-2 antibodies are used to treat breast cancer. In
addition, antibody can be labeled or conjugated to enhance its deleterious effect, e.g., with
radionuclides and other energy emitting entities, toxins, such as ricin, exotoxin A (ETA),
and diphtheria, cytotoxic or cytostatic agents, immunomodulators, chemotherapeutic agents,
etc. See, e.g., U.S. Pat. No. 6,107,090.

An antibody or other specific-binding partner can be conjugated to a second molecule,
such as a cytotoxic agent, and used for targeting the second molecule to a tissue-antigen
positive cell (Vitetta, E. S. et al., 1993, Immunotoxin therapy, in DeVita, Jr., V. T. et al., eds.,
Cancer: Principles and Practice of Oncology, 4th ed., J. B. Lippincott Co., Philadelphia,
2624-2636). Examples of cytotoxic agents include, but are not limited to, antimetabolites,
alkylating agents, anthracyclines, antibiotics, anti-mitotic agents, radioisotopes and
chemotherapeutic agents. Further examples of cytotoxic agents include, but are not limited to,
ricin, doxorubicin, daunorubicin, taxol, ethidium bromide, mitomycin, etoposide, tenoposide,
vincristine, vinblastine, colchicine, dihydroxy anthracin dione, actinomycin D,
1-dehydrotestosterone, diphtheria toxin, Pseudomonas exotoxin (PE) A, PE40, abrin,
elongation factor-2 and glucocorticoid. Techniques for conjugating therapeutic agents to
antibodies are well known.

In addition to immunotherapy, polynucleotides and polypeptides can be used as
targets for non-immunotherapeutic applications, e.g., using compounds which interfere with
function, expression (e.g., antisense as a therapeutic agent), assembly, etc. RNA interference
can be used in vitro and in vivo to silence differentially regulated cancer genes when its
expression contributes to a disease (but also for other purposes, e.g., to identify the gene's
function, to change a developmental pathway of a cell, etc.). See, e.g., Sharp and Zamore,
Science, 287:2431-2433, 2001; Grishok et al., Science, 287:2494, 2001.

Delivery of therapeutic agents can be achieved according to any effective method,
including, liposomes, viruses, plasmid vectors, bacterial delivery systems, orally,
systemically, etc. Therapeutic agents of the present invention can be administered in any
form by any effective route, including, e.g., oral, parenteral, enteral, intraperitoneal, topical,
transdermal (e.g., using any standard patch), intravenously, ophthalmic, nasally, local,
non-oral, such as aerosol, inhalation, subcutaneous, intramuscular, buccal, sublingual, rectal,
vaginal, intra-arterial, and intrathecal, etc. They can be administered alone, or in
combination with any ingredient(s), active or inactive.

In addition to therapeutics, per se, the present invention also relates to methods of
treating a cancer showing altered expression of differentially regulated cancer genes,
comprising, e.g., administering to a subject in need thereof a therapeutic agent which is
effective for regulating expression of said differentially regulated cancer genes and/or which
is effective in treating said disease. The term "treating" is used conventionally, e.g., the
management or care of a subject for the purpose of combating, alleviating, reducing,
relieving, improving the cancer. By the phrase "altered expression," it is meant that the
disease is associated with a mutation in the gene, or any modification to the gene (or
corresponding product) which affects its normal function. Thus, expression of a
differentially regulated cancer gene refers to, e.g., transcription, translation, splicing, stability
of the mRNA or protein product, activity of the gene product, differential expression, etc.

Any agent which "treats" the disease can be used. Such an agent can be one which
regulates the expression of the differentially regulated cancer genes. Expression refers to the
same acts already mentioned, e.g., transcription, translation, splicing, stability of the mRNA
or protein product, activity of the gene product, differential expression, etc. For instance, if
the condition was a result of a complete deficiency of the gene product, administration of
gene product to a patient would be said to treat the disease and regulate the gene's
expression. Many other situations are possible, e.g., where the gene is aberrantly
expressed, and the therapeutic agent regulates the aberrant expression by restoring its normal 
expression pattern. 



Arrays

The present invention also relates to an ordered array of polynucleotide probes and
specific-binding partners (e.g., antibodies) for detecting the expression of differentially
regulated cancer genes in a sample, comprising, one or more polynucleotide probes or
specific binding partners associated with a solid support or in separate receptacles, wherein
each probe is specific for differentially regulated cancer genes, and the probes comprise a
nucleotide sequence of SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45,
47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99 which is
specific for said gene, a nucleotide sequence having sequence identity to SEQ ID NOS 1, 3,
5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68,
70, 72, 74, 84, 85, 86, 88, 95, and/or 99 which is specific for said gene or polynucleotide, or
complements thereto, or a specific-binding partner which is specific for differentially
regulated cancer genes.

The phrase "ordered array" indicates that the probes are arranged in an identifiable or
position-addressable pattern, e.g., such as the arrays disclosed in U.S. Pat. Nos. 6,156,501,
6,077,673, 6,054,270, 5,723,320, 5,700,637, WO09919711, WO00023803. The probes are
associated with the solid support in any effective way. For instance, the probes can be bound
to the solid support, either by polymerizing the probes on the substrate, or by attaching a
probe to the substrate. Association can be, covalent, electrostatic, noncovalent, hydrophobic,
hydrophilic, coordination, adsorbed, absorbed, polar, etc. When fibers or
hollow filaments are utilized for the array, the probes can fill the hollow orifice, be absorbed
into the solid filament, be attached to the surface of the orifice, etc. Probes can be of any
effective size, sequence identity, composition, etc., as already discussed.
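
A position-addressable pattern can be pictured as a mapping from a spot address to the probe
at that address. The Python sketch below is illustrative only and is not taken from the
specification or the cited patents: the `Spot` class, the function `build_ordered_array`,
the row-major layout, and the probe labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    row: int
    column: int
    probe_id: str  # hypothetical label for the probe placed at this address


def build_ordered_array(probe_ids, columns):
    """Assign each probe a unique (row, column) address in row-major order."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    layout = {}
    for index, probe_id in enumerate(probe_ids):
        row, column = divmod(index, columns)
        layout[(row, column)] = Spot(row, column, probe_id)
    return layout


# Hypothetical four-probe array laid out on a grid two columns wide.
probes = ["SEQ ID NO 1", "SEQ ID NO 3", "SEQ ID NO 5", "SEQ ID NO 7"]
layout = build_ordered_array(probes, columns=2)
print(layout[(1, 0)].probe_id)
```

Because every address maps to exactly one probe, a signal observed at a given position can
be traced back to the probe that produced it, which is the sense of "identifiable" assumed
here.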

Transgenic animals 

The present invention also relates to transgenic animals comprising differentially
regulated cancer genes genes, and homologs thereof. (Methods of making transgenic
animals, and associated recombinant technology, can be accomplished conventionally, e.g.,
as described in Transgenic Animal Technology, Pinkert et al., 2nd Edition, Academic Press,
2002.) Such genes, as discussed in more detail below, include, but are not limited to,
functionally-disrupted genes, mutated genes, ectopically or selectively-expressed genes,
inducible or regulatable genes, etc. These transgenic animals can be produced according to
any suitable technique or method, including homologous recombination, mutagenesis (e.g.,
ENU, Rathkolb et al., Exp. Physiol., 85(6):635-644, 2000), and the tetracycline-regulated
gene expression system (e.g., U.S. Pat. No. 6,242,667). The term "gene" as used herein
includes any part of a gene, i.e., regulatory sequences, promoters, enhancers, exons, introns,
coding sequences, etc. The differentially regulated cancer genes nucleic acid present in the
construct or transgene can be naturally-occurring wild-type, polymorphic, or mutated. Where
the animal is a non-human animal, its homolog can be used instead. Transgenic animals can
be susceptible to cancer, e.g., prostate or breast cancer.

Along these lines, polynucleotides of the present invention can be used to create
transgenic animals, e.g., a non-human animal, comprising at least one cell whose genome
comprises a functional disruption of a differentially regulated cancer gene, or a homolog
thereof (e.g., a mouse homolog when a mouse is used). By the phrases "functional
disruption" or "functionally disrupted," it is meant that the gene does not express a
biologically-active product. It can be substantially deficient in at least one functional activity
coded for by the gene. Expression of a polypeptide can be substantially absent, i.e.,
essentially undetectable amounts are made. However, polypeptide can also be made, but
which is deficient in activity, e.g., where only an amino-terminal portion of the gene product 
is produced. 

The transgenic animal can comprise one or more cells. When substantially all its 
cells contain the engineered gene, it can be referred to as a transgenic animal "whose genome
comprises" the engineered gene. This indicates that the endogenous gene loci of the animal
have been modified and substantially all cells contain such modification.

Functional disruption of the gene can be accomplished in any effective way, 
including, e.g., introduction of a stop codon into any part of the coding sequence such that the
resulting polypeptide is biologically inactive (e.g., because it lacks a catalytic domain, a
ligand binding domain, etc.), introduction of a mutation into a promoter or other regulatory
sequence that is effective to turn it off, or reduce transcription of the gene, insertion of an
exogenous sequence into the gene which inactivates it (e.g., which disrupts the production of
a biologically-active polypeptide or which disrupts the promoter or other transcriptional
machinery), deletion of sequences from the differentially regulated cancer genes gene (or
homolog thereof), etc. Examples of transgenic animals having functionally disrupted genes
are well known, e.g., as described in U.S. Pat. Nos. 6,239,326, 6,225,525, 6,207,878,
6,194,633, 6,187,992, 6,180,849, 6,177,610, 6,100,445, 6,087,555, 6,080,910, 6,069,297,
6,060,642, 6,028,244, 6,013,858, 5,981,830, 5,866,760, 5,859,314, 5,850,004, 5,817,912,
5,789,654, 5,777,195, and 5,569,824. A transgenic animal which comprises the functional
disruption can also be referred to as a "knock-out" animal, since the biological activity of its
differentially regulated cancer genes genes has been "knocked-out." Knockouts can be
homozygous or heterozygous.

For creating functionally disrupted genes, and other gene mutations, homologous 
recombination technology is of special interest since it allows specific regions of the genome 
to be targeted. Using homologous recombination methods, genes can be
specifically-inactivated, specific mutations can be introduced, and exogenous sequences can be
introduced at specific sites. These methods are well known in the art, e.g., as described in the
patents above. See, also, Robertson, Biol. Reproduc., 44(2):238-245, 1991. Generally, the
genetic engineering is performed in an embryonic stem (ES) cell, or other pluripotent cell line
(e.g., adult stem cells, EG cells), and that genetically-modified cell (or nucleus) is used to
create a whole organism. Nuclear transfer can be used in combination with homologous 
recombination technologies. 

For example, the differentially regulated cancer genes locus can be disrupted in
mouse ES cells using a positive-negative selection method (e.g., Mansour et al., Nature,
336:348-352, 1988). In this method, a targeting vector can be constructed which comprises a
part of the gene to be targeted. A selectable marker, such as neomycin resistance genes, can
be inserted into a differentially regulated cancer genes exon present in the targeting vector,
disrupting it. When the vector recombines with the ES cell genome, it disrupts the function
of the gene. The presence in the cell of the vector can be determined by expression of
neomycin resistance. See, e.g., U.S. Pat. No. 6,239,326. Cells having at least one
functionally disrupted gene can be used to make chimeric and germline animals, e.g., animals
having somatic and/or germ cells comprising the engineered gene. Homozygous knock-out
animals can be obtained from breeding heterozygous knock-out animals. See, e.g., U.S. Pat.
No. 6,225,525.
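
The breeding step mentioned above can be illustrated with a simple worked example. The
Python sketch below is illustrative only and is not taken from the specification or the
cited patents: it assumes ordinary Mendelian segregation of a single locus and equal
viability of all genotypes, and the function name `cross` and the allele symbols are
hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected genotype fractions among offspring of two parents.

    Each parent is a 2-character string of alleles, where '+' denotes the
    wild-type allele and '-' denotes the disrupted (knock-out) allele.
    Assumes each allele of a parent is transmitted with equal probability.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygous x heterozygous cross.
offspring = cross("+-", "+-")
for genotype, fraction in sorted(offspring.items()):
    print(genotype, fraction)
```

Under these assumptions, one quarter of the offspring of two heterozygous animals would be
expected to be homozygous for the disrupted allele; observed ratios can differ, e.g., if the
homozygous genotype reduces viability.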

The present invention also relates to a non-human, transgenic animal whose genome
comprises recombinant differentially regulated cancer nucleic acid (and homologs thereof)
operatively linked to an expression control sequence effective to express said coding
sequence, e.g., in prostate and/or breast tissues. Such a transgenic animal can also be
referred to as a "knock-in" animal since an exogenous gene has been introduced, stably, into
its genome.

A recombinant differentially regulated cancer genes nucleic acid refers to a
polynucleotide which has been introduced into a target host cell and optionally modified,
such as cells derived from animals, plants, bacteria, yeast, etc. A recombinant differentially
regulated cancer genes includes completely synthetic nucleic acid sequences, semi-synthetic
nucleic acid sequences, sequences derived from natural sources, and chimeras thereof.

"Operable linkage" has the meaning used through the specification, i.e., placed in a
functional relationship with another nucleic acid. When a gene is operably linked to an
expression control sequence, as explained above, it indicates that the gene (e.g., coding
sequence) is joined to the expression control sequence (e.g., promoter) in such a way that
facilitates transcription and translation of the coding sequence. As described above, the
phrase "genome" indicates that the genome of the cell has been modified. In this case, the
recombinant differentially regulated cancer genes has been stably integrated into the genome
of the animal. The differentially regulated cancer genes nucleic acid (e.g., a coding
sequence) in operable linkage with the expression control sequence can also be referred to as
a construct or transgene.

The present invention also relates to a transgenic animal which contains a functionally
disrupted gene and a transgene stably integrated into the animal's genome. Such an animal can
be constructed using combinations of any of the above- and below-mentioned methods. Such
animals have any of the aforementioned uses, including permitting the knock-out of the
normal gene and its replacement with a mutated gene. Such a transgene can be integrated at
the endogenous gene locus so that the functional disruption and "knock-in" are carried out in
the same step.

In addition to the methods mentioned above, transgenic animals can be prepared
according to known methods, including, e.g., by pronuclear injection of recombinant genes
into pronuclei of 1-cell embryos, incorporating an artificial yeast chromosome into
embryonic stem cells, gene targeting methods, embryonic stem cell methodology, cloning
methods, nuclear transfer methods. See, also, e.g., U.S. Patent Nos. 4,736,866; 4,873,191;
4,873,316; 5,082,779; 5,304,489; 5,174,986; 5,175,384; 5,175,385; 5,221,778; Gordon et al.,
Proc. Natl. Acad. Sci., 77:7380-7384, 1980; Palmiter et al., Cell, 41:343-345, 1985; Palmiter
et al., Ann. Rev. Genet., 20:465-499, 1986; Askew et al., Mol. Cell. Bio., 13:4115-4124,
1993; Games et al., Nature, 373:523-527, 1995; Valancius and Smithies, Mol. Cell. Bio.,
11:1402-1408, 1991; Stacey et al., Mol. Cell. Bio., 14:1009-1016, 1994; Hasty et al., Nature,
350:243-246, 1995; Rubinstein et al., Nucl. Acid Res., 21:2613-2617, 1993; Cibelli et al.,
Science, 280:1256-1258, 1998. For guidance on recombinase excision systems, see, e.g.,
U.S. Pat. Nos. 5,626,159, 5,527,695, and 5,434,066. See also, Orban, P.C., et al.,
"Tissue- and Site-Specific DNA Recombination in Transgenic Mice," Proc. Natl. Acad. Sci.
USA, 89:6861-6865 (1992); O'Gorman, S., et al., "Recombinase-Mediated Gene Activation and
Site-Specific Integration in Mammalian Cells," Science, 251:1351-1355 (1991); Sauer, B., et
al., "Cre-stimulated recombination at loxP-Containing DNA sequences placed into the
mammalian genome," Polynucleotides Research, 17(1):147-161 (1989); Gagneten, S. et al.
(1997) Nucl. Acids Res. 25:3326-3331; Xiao and Weaver (1997) Nucl. Acids Res.
25:2985-2991; Agah, R. et al. (1997) J. Clin. Invest. 100:169-179; Barlow, C. et al. (1997)
Nucl. Acids Res. 25:2543-2545; Araki, K. et al. (1997) Nucl. Acids Res. 25:868-872; Mortensen,
R. N. et al. (1992) Mol. Cell. Biol. 12:2391-2395 (G418 escalation method); Lakhlani, P. P.
et al. (1997) Proc. Natl. Acad. Sci. USA 94:9950-9955 ("hit and run"); Westphal and Leder
(1997) Curr. Biol. 7:530-533 (transposon-generated "knock-out" and "knock-in");
Templeton, N. S. et al. (1997) Gene Ther. 4:700-709 (methods for efficient gene targeting,
allowing for a high frequency of homologous recombination events, e.g., without selectable
markers); PCT International Publication WO 93/22443 (functionally-disrupted).

A polynucleotide according to the present invention can be introduced into any
non-human animal, including a non-human mammal, mouse (Hogan et al., Manipulating the
Mouse Embryo: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor,
New York, 1986), pig (Hammer et al., Nature, 315:343-345, 1985), sheep (Hammer et al.,
Nature, 315:343-345, 1985), cattle, rat, or primate. See also, e.g., Church, 1987, Trends in
Biotech. 5:13-19; Clark et al., Trends in Biotech. 5:20-24, 1987; and DePamphilis et al.,
BioTechniques, 6:662-680, 1988. Transgenic animals can be produced by the methods
described in U.S. Pat. No. 5,994,618, and utilized for any of the utilities described therein.

Database

The present invention also relates to electronic forms of polynucleotides,
polypeptides, etc., of the present invention, including computer-readable medium (e.g.,
magnetic, optical, etc., stored in any suitable format, such as flat files or hierarchical files)
which comprise such sequences, or fragments thereof, e-commerce-related means, etc.
Along these lines, the present invention relates to methods of retrieving gene sequences from
a computer-readable medium, comprising, one or more of the following steps in any effective
order, e.g., selecting a cell or gene expression profile, e.g., a profile that specifies that said
gene is differentially expressed in breast and/or prostate cancer, and retrieving said
differentially expressed gene sequences, where the gene sequences consist of the genes
represented by SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50,
48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and/or 99.

A "gene expression profile" means the list of tissues, cells, etc., in which a defined
gene is expressed (i.e., transcribed and/or translated). A "cell expression profile" means the
genes which are expressed in the particular cell type. The profile can be a list of the tissues
in which the gene is expressed, but can include additional information as well, including
level of expression (e.g., a quantity as compared or normalized to a control gene), and
information on temporal (e.g., at what point in the cell-cycle or developmental program) and
spatial expression. By the phrase "selecting a gene or cell expression profile," it is meant that
a user decides what type of gene or cell expression pattern he is interested in retrieving. Any
pattern of expression preferences may be selected. The selecting can be performed by any
effective method. In general, "selecting" refers to the process in which a user forms a query
that is used to search a database of gene expression profiles. The step of retrieving involves
searching for results in a database that correspond to the query set forth in the selecting step.
Any suitable algorithm can be utilized to perform the search query, including algorithms that
look for matches, or that perform optimization between query and data. The database is
information that has been stored in an appropriate storage medium, having a suitable
computer-readable format. Once results are retrieved, they can be displayed in any suitable
format, such as HTML.
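
The two kinds of profile defined above can be pictured as a small record type. The following is a minimal sketch in Python and is not part of the original disclosure: the class name `GeneExpressionProfile`, its fields, the function `cell_expression_profile`, and the idea of a numeric threshold are all assumptions made for illustration. It stores, for one gene, an expression level per tissue, normalizes those levels to a control gene, and derives the list of genes expressed in a given cell type.

```python
from dataclasses import dataclass, field


@dataclass
class GeneExpressionProfile:
    """Hypothetical record: expression of one gene across tissues."""
    gene_id: str
    # tissue name -> raw expression signal (arbitrary units, illustrative)
    raw_levels: dict = field(default_factory=dict)

    def expressed_in(self, threshold=0.0):
        """Tissues in which the gene is expressed above the threshold."""
        return sorted(t for t, v in self.raw_levels.items() if v > threshold)

    def normalized(self, control_levels):
        """Levels divided by the control gene's level in the same tissue.

        Tissues with no (or zero) control measurement are skipped.
        """
        return {
            tissue: level / control_levels[tissue]
            for tissue, level in self.raw_levels.items()
            if control_levels.get(tissue)
        }


def cell_expression_profile(profiles, tissue, threshold=0.0):
    """Genes expressed in one cell or tissue type (a 'cell expression profile')."""
    return sorted(
        p.gene_id for p in profiles
        if p.raw_levels.get(tissue, 0.0) > threshold
    )
```

Temporal or spatial information could be added as further fields on the same record; they are omitted here to keep the sketch short.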

For instance, the user may be interested in identifying genes that are differentially
expressed in cancer. He may not care whether small amounts of expression occur in other
tissues, as long as such genes are not expressed in peripheral blood lymphocytes. A query is
formed by the user to retrieve the set of genes from the database having the desired gene or
cell expression profile. Once the query is input into the system, a search algorithm is used
to interrogate the database, and retrieve results.
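
The example query above (differentially expressed in cancer, but not expressed in peripheral blood lymphocytes) can be sketched as a simple filter. This is only an illustration under stated assumptions: the database layout, the function name `retrieve_genes`, the gene identifiers, and the numeric levels are hypothetical, and the text leaves the actual search algorithm open.

```python
def retrieve_genes(database, differential_in, not_expressed_in, threshold=0.0):
    """Return the IDs of genes matching the query, in sorted order.

    database maps gene_id -> {"differential_in": set of conditions,
                              "levels": mapping tissue -> expression level}.
    This layout is an assumption; the description does not prescribe one.
    """
    hits = []
    for gene_id, profile in database.items():
        # The gene must be differentially expressed in the chosen condition.
        if differential_in not in profile["differential_in"]:
            continue
        # The gene must not be expressed above threshold in the excluded tissue.
        if profile["levels"].get(not_expressed_in, 0.0) > threshold:
            continue
        hits.append(gene_id)
    return sorted(hits)


# Made-up records, for illustration only.
EXAMPLE_DB = {
    "GENE_A": {"differential_in": {"breast cancer"},
               "levels": {"peripheral blood lymphocytes": 0.0, "liver": 0.3}},
    "GENE_B": {"differential_in": {"breast cancer", "prostate cancer"},
               "levels": {"peripheral blood lymphocytes": 2.5}},
    "GENE_C": {"differential_in": {"prostate cancer"},
               "levels": {}},
}
```

With these records, a query for "breast cancer" that excludes "peripheral blood lymphocytes" returns only GENE_A, because GENE_B is expressed in lymphocytes and GENE_C is not differentially expressed in breast cancer.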

Advertising, licensing, etc., methods 

The present invention also relates to methods of advertising, licensing, selling, 
purchasing, brokering, etc., genes, polynucleotides, specific-binding partners, antibodies, etc., 
of the present invention. Methods can comprise, e.g., displaying a differentially regulated
cancer gene, differentially regulated cancer gene polypeptide, or antibody specific for a
differentially regulated cancer gene in a printed or computer-readable medium (e.g., on the
Web or Internet), and accepting an offer to purchase said gene, polypeptide, or antibody.

Other 

A polynucleotide, probe, polypeptide, antibody, specific-binding partner, etc., 
according to the present invention can be isolated. The term "isolated" means that the
material is in a form in which it is not found in its original environment or in nature, e.g.,
more concentrated, more purified, separated from components, etc. An isolated
polynucleotide includes, e.g., a polynucleotide having the sequence separated from the
chromosomal DNA found in a living animal, e.g., as the complete gene, a transcript, or a
cDNA. This polynucleotide can be part of a vector or inserted into a chromosome (by
specific gene-targeting or by random integration at a position other than its normal position)
and still be isolated in that it is not in a form that is found in its natural environment. A
polynucleotide, polypeptide, etc., of the present invention can also be substantially purified.
By substantially purified, it is meant that the polynucleotide or polypeptide is separated and is
essentially free from other polynucleotides or polypeptides, i.e., the polynucleotide or
polypeptide is the primary and active constituent. A polynucleotide can also be a
recombinant molecule. By "recombinant," it is meant that the polynucleotide is an
arrangement or form which does not occur in nature. For instance, a recombinant molecule
comprising a promoter sequence would not encompass the naturally-occurring gene, but
would include the promoter operably linked to a coding sequence not associated with it in
nature, e.g., a reporter gene, or a truncation of the normal coding sequence.

The term "marker" is used herein to indicate a means for detecting or labeling a
target. A marker can be a polynucleotide (usually referred to as a "probe"), polypeptide (e.g.,
an antibody conjugated to a detectable label), PNA, or any effective material.

The topic headings set forth above are meant as guidance where certain information
can be found in the application, but are not intended to be the only source in the application
where information on such topic can be found.

Reference materials

For other aspects of the polynucleotides, reference is made to standard textbooks of
molecular biology. See, e.g., Hames et al., Polynucleotide Hybridization, IL Press, 1985;
Davis et al., Basic Methods in Molecular Biology, Elsevier Sciences Publishing, Inc., New
York, 1986; Sambrook et al., Molecular Cloning, CSH Press, 1989; Howe, Gene Cloning
and Manipulation, Cambridge University Press, 1995; Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, Inc., 1994-1998.

Without further elaboration, it is believed that one skilled in the art can, using the
preceding description, utilize the present invention to its fullest extent. The following
preferred specific embodiments are, therefore, to be construed as merely illustrative, and not
limitative of the remainder of the disclosure in any way whatsoever. The entire disclosure of
all applications, patents and publications, cited above and in the figures are hereby
incorporated by reference in their entirety.
Table 1

Clone ID          Protein-L*   Location        Domain names

1. Pcp0749-2z     1564aa       Nuclear         1. ZnF_C3H1 domain: 36-63aa;
                                               2. Caldesmon domain: 423-1027aa;
                                               3. Coiled coil: 162-197aa;
                                               4. Coiled coil: 645-789aa;
                                               5. Coiled coil: 1339-1366aa.

2. Pcp0389Az      694aa        Extracellular   1. Coiled coil: 17-51aa;
                                               2. Methyl-accepting chemotaxis-like domain
                                                  (MA): 101-200aa.

3. Pcp0814z       279aa        Membrane        1. Transmembrane domain: 83-105aa;
                                               2. Transmembrane domain: 120-142aa;
                                               3. Transmembrane domain: 192-211aa;
                                               4. MotA/TolQ/ExbB proton channel domain:
                                                  34-156aa;
                                               5. Coiled coil: 226-253aa.

4. Pcp0623        2697aa       Nuclear         1. Caldesmon domain: 590-884aa.

5. Pcp0815        1041aa       Nuclear         1. ZF_C2H2: 371-393aa;
                                               2. ZF_C2H2: 399-421aa;
                                               3. ZF_C2H2: 621-651aa;
                                               4. ZF_C2H2: 657-679aa;
                                               5. ZF_C2H2: 689-711aa;
                                               6. ZF_C2H2: 909-931aa;
                                               7. ZF_C2H2: 938-961aa;
                                               8. PP_M1 Phosphoprotein domain: 754-923aa.

6. Pcp0840z       243aa        Nuclear         1. ZnF_C2H2 domains: 12-37aa;
                                               2. ZnF_C2H2 domains: 173-198aa;
                                               3. ZnF_C2H2 domains: 208-230aa.

7. Pcp0424Az      789aa        Cytoplasm       1. Dynamin, large GTPase domain: 54-308aa
                                                  (GTP-binding);
                                               2. Dynamin GTPase effector domain: 692-783aa.

7. Pcp0424Bz      763aa        Cytoplasm       1. Dynamin, large GTPase domain: 54-308aa
                                                  (GTP-binding);
                                               2. Dynamin GTPase effector domain: 692-783aa;

7. Pcp0424Cz      752aa        Cytoplasm       1. Dynamin, large GTPase domain: 54-308aa
                                                  (GTP-binding);
                                               2. Dynamin GTPase effector domain: 692-783aa.

8. Pc0382         1584aa       Nuclear         1. SAM domain: 11-78aa;
                                               2. Kinesin domain: 1079-1103aa.

9. Pcp0816        1018aa       Membrane        1. Signal peptide: 1-38aa;
                                               2. EGF-like domain: 274-308aa;
                                               3. Transmembrane domain: 908-930aa.

10. Pcp0480       171aa        Nuclear         1. Estrogen receptor: 1-169aa.

11. Pcp0842x      2221aa       Cytoplasm       1. SET7 domain: 630-842aa

L* stands for protein length in amino acids
TABLE 2

Clone ID          Locus

1. Pcp0749-2z     13q14.11
2. Pcp0389Az      6q22.33
3. Pcp0814z       12p13.3
4. Pcp0623        5p13.2
5. Pcp0815        14q11.1
6. Pcp0840z       7p15.1
7. Pcp0424Az      12p12.3
7. Pcp0424Bz      12p12.3
7. Pcp0424Cz      12p12.3
8. Pc0382         7q21.3
9. Pcp0816        1p13.1
10. Pcp0480       6q25.1
11. Pcp0842x      6q23.3

Associated diseases

1. Rieger syndrome type 2 at 13q14;
2. Low grade B-cell malignancy at 13q14.

1. IgA nephropathy at 6q22-q23;
2. Autosomal recessive craniometaphyseal dysplasia at 6q21-q22;
3. Heterocellular hereditary persistence of fetal hemoglobin at 6q22.3-q23.1;
4. Oculodentodigital dysplasia at 6q22-q24;
5. Susceptibility to severe hepatic fibrosis due to Schistosoma mansoni
infection at 6q22-q23.

1. Chromosomal abnormalities associated with breast and
ovary cancer.

1. Respiratory allergies (asthma) at 14q11.1

1. Alzheimer disease familial type 5 at 12p11.23-q13.12;
2. Fibrosis of extraocular muscles congenital 1 at 12p11.2-q12;
3. Hypertension with brachydactyly at 12p12.2-p11.2.

same as Bcu0424Az.
same as Bcu0424Az.

1. Split-hand/foot malformation type-1 (SHFM1) at 7q21.2-q21.3;
2. SHFM with sensorineural hearing loss (SHFM1D) at 7q21.2-21.3;
4. Malignant hyperthermia susceptibility 3 at 7q21-q22;
5. Myoclonic dystonia-11 at 7q21.

1. Vesicoureteral reflux (VUR) at 1p13;
2. Trisomy and Monosomy at 1p13 cause cancers in prostate, ovary
and breast

1. Estrogen receptor-1 at 6q25.1 (Alternative isoforms are related to
breast cancer);
2. Schizophrenia-5 at 6q26-q13;
3. Insulin-dependent diabetes mellitus-8 at 6q25-q27.

1. Dilated cardiomyopathy 1J (CMD1J) at 6q23-q24;
2. Dilated cardiomyopathy 1F (CMD1F) at 6q23;
3. Oculodentodigital dysplasia (ODDD) at 6q22-q24;
4. Susceptibility to severe hepatic fibrosis due to Schistosoma mansoni
infection at 6q22-q23;
5. IgA nephropathy at 6q22-q23.
TABLE 3

Gene Name    SEQ ID NO                           Expression

PCP0749      1-4                                 UP
PCP0389      5-8                                 UP
PCP0814      9-10                                DOWN
PCP0623      11-12                               DOWN
PCP0815      13-14                               UP
PCP0840      15-16                               DOWN
PCP0424      17-18 (A); 19-20 (B); 21-22 (C)     UP
PC0382       23-24                               UP
PCP0816      25-26                               UP
PCP0480      27-28                               UP
PCP0842      29-30                               UP

normal expression
restricted to muscle
and uterus
TABLE 4

GENE                    EXPRESSION   LENGTH   DOMAINS

1. PCP0405              DOWN         1379     1. CUB domain: 93-209aa;
(SEQ ID NO 45-46)                             2. DSL domain: 222-280aa;
                                              3. Kelch domain: 480-531aa;
                                              4. Kelch domain: 532-591aa;
                                              5. PSI domain: 614-657aa;
                                              6. PSI domain: 666-709aa;
                                              7. CLECT domain: 748-873aa;
                                              8. PSI domain: 889-939aa;
                                              9. PSI domain: 942-1012aa;
                                              10. EGF-like domain: 1014-1057aa;
                                              11. EGF-like domain: 1060-1106aa;
                                              12. Transmembrane domain: 1230-1252aa.

2. PCP0454A             DOWN         3863     1. Internal repeat 2: 19-72;
(SEQ ID NO 50-51)                             2. Internal repeat 1: 71-135;
                                              3. Internal repeat 2: 332-385;
                                              4. Internal repeat 1: 488-554.

PCP0454B                DOWN         577      1. Internal repeat 1: 1-137;
(SEQ ID NO 48-49)                             2. AAA domain: 241-408.

3. PCP0459              UP           715      1. Gag p10 domain: 1-89;
(SEQ ID NO 52-53)                             2. Gag p24 domain: 360-573;
                                              3. ZnF C2HC domain: 592-608;
                                              4. ZnF C2HC domain: 629-645.

4. PC0177A              UP           1744     1. Coiled coil: 646-685;
(SEQ ID NO 54-55)                             2. Coiled coil: 1469-1481;
                                              3. Coiled coil: 1656-1684.

PC0177B                 UP           1709     1. Coiled coil: 611-650;
(SEQ ID NO 56-57)                             2. Coiled coil: 1434-1456;
                                              3. Coiled coil: 1621-1649.

PC0177C                 UP           1908     1. Coiled coil: 611-650;
(SEQ ID NO 58-59)                             2. Coiled coil: 1434-1456;
                                              3. Coiled coil: 1621-1649.

PC0177D                 UP           1309     1. Coiled coil: 611-650
(SEQ ID NO 60-61)

5. PCP0557              UP           1593     1. HisKA: 565-620;
(SEQ ID NO 62-63)                             2. Coiled coil: 933-965;
                                              3. Coiled coil: 1464-1491.

6. PCP0664              UP           112      1. Transmembrane: 4-26.
(SEQ ID NO 64-65)

7. PCP0677              UP           89       No domain found.
(SEQ ID NO 66-67)

8. PCP0762              UP           221      1. SCAN domain: 42-137
(SEQ ID NO 68-69)

9. PCP0806              UP           548      1. SCOP domain: 10-122
(SEQ ID NO 70-71)                             2. Coiled coil: 374-409

10. PCP0815A            UP           1005     1. ZnF C2H2 domain: 371-393;
(SEQ ID NO 72-73)                             2. ZnF C2H2 domain: 399-421;
                                              3. ZnF C2H2 domain: 629-651;
                                              4. ZnF C2H2 domain: 657-679;
                                              5. ZnF C2H2 domain: 689-711;
                                              6. ZnF C2H2 domain: 909-931;
                                              7. ZnF C2H2 domain: 938-961.

PCP0815C                UP           198      No domain found
(SEQ ID NO 74-75)
Clone ID     Locus           Diseases

PCP0405      10q26           Cancers

PCP0454      6q15            Amaurosis Congenita Of Leber V;
                             Cardiomyopathy, Dilated, 1k (Cmd1k);
                             Chorioretinal Atrophy, Progressive Bifocal;
                             Macular Dystrophy, Retinal, 1, North Carolina Type (Mcdr1)

PCP0459      22q11.21        Cancers

PC0177       10p11.22        Diabetes

PCP0577      Xq25-q26.3      Mental Retardation, X-Linked, With Short Stature, Small Testes,
                             Muscle Wasting, And Tremor;
                             Hypertrichosis, Congenital Generalized (Htc2);
                             Borjeson-Forssman-Lehmann Syndrome (Bfls);
                             Mental Retardation

PCP0664      Xq25-Xq26       Mental Retardation, X-Linked, With Isolated Growth Hormone
                             Deficiency (Mrgh);
                             Hypertrichosis, Congenital Generalized (Htc2);
                             Mental Retardation, X-Linked, South African Type;
                             Borjeson-Forssman-Lehmann Syndrome (Bfls);
                             Mental Retardation With Optic Atrophy, Deafness, And Seizures;
                             Mental Retardation

PCP0677      12q15           Scapuloperoneal Myopathy (SPM);
                             Cancers (e.g., glioma)

PCP0762      18q12.1         Cancers

PCP0806      2q37.3          Gracile Syndrome;
                             Holoprosencephaly 6

PCP0815      14q11.1-q12     Arrhythmogenic Right Ventricular Dysplasia, Familial, 3 (Arvd3);
                             Radiation Sensitivity/Chromosome Instability Syndrome,
                             Autosomal Dominant;
                             Asthma
Claims:

1. An isolated polynucleotide comprising,

a polynucleotide sequence which codes without interruption for the amino acid
sequence selected from SEQ ID NOS 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 46,
51, 49, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 87, 89, 96, and 100, or a complement
thereto.

2. An isolated polynucleotide comprising,

a polynucleotide sequence having 95% or more sequence identity along the entire
length of a polynucleotide sequence selected from SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86,
88, 95, and 99, or a complement thereto.
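
As an illustration of the "95% or more sequence identity along the entire length" criterion, the following Python sketch computes percent identity from two sequences that have already been aligned. It is not part of the claims: the claim does not state how identity is to be calculated, so the function names, the use of '-' for gaps, and the choice to divide by the full length of the reference sequence are all assumptions.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percent identity over the entire length of the reference sequence.

    Both strings must come from the same pairwise alignment, so they have
    equal length, with '-' marking gaps. Columns where the reference has a
    gap are ignored (an assumption made for this sketch).
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    query = aligned_query.upper()
    reference = aligned_reference.upper()
    reference_length = sum(1 for base in reference if base != "-")
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for q, r in zip(query, reference) if r != "-" and q == r
    )
    return 100.0 * matches / reference_length


def meets_identity_threshold(aligned_query, aligned_reference, threshold=95.0):
    """True if the query reaches the identity threshold against the reference."""
    return percent_identity(aligned_query, aligned_reference) >= threshold
```

A 20-base query with a single mismatch against a 20-base reference gives 19 of 20 matching positions and would just meet a 95% threshold under these assumptions.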

3. An isolated polynucleotide comprising,

a differentially regulated mammalian cancer gene having a polynucleotide sequence
which hybridizes under high stringency conditions to a polynucleotide having a
polynucleotide sequence set forth in SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25,
27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and 99,
or a complement thereto.

4. An isolated polynucleotide of claim 3, wherein said high stringency conditions comprise
hybridizing at 42°C in 5X SSPE, 0.3% SDS, and 50% formamide, and washes at 65°C for 15
minutes in 2X SSC, and 0.2% SDS.

5. An isolated human polypeptide which is differentially regulated in a human cancer,
comprising the complete amino acid sequence as set forth in SEQ ID NOS 2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24, 26, 28, 30, 46, 51, 49, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 87,
89, 96, or 100, or a specific fragment thereof.

6. An isolated mammalian polypeptide which is differentially regulated in a mammalian
cancer, having 95% or more sequence identity along its entire length to a complete amino
acid sequence as set forth in SEQ ID NOS 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
46, 51, 49, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 87, 89, 96, or 100, or a specific
fragment thereof.

7. An isolated polypeptide which is coded for by a polynucleotide of claim 2 or 4.

8. A method of detecting a nucleic acid coding for human differentially regulated cancer
genes, comprising,

contacting a sample comprising nucleic acid with a polynucleotide probe specific for
a differentially regulated human cancer gene of claim 1 under conditions effective for said
probe to hybridize specifically with said gene, and

detecting hybridization between said probe and said nucleic acid.

9. A method of claim 8, wherein said detecting is performed by:

Northern blot analysis, polymerase chain reaction (PCR), reverse transcriptase PCR,
RACE PCR, or in situ hybridization.

10. A method of treating a cancer showing altered expression of a differentially regulated
human cancer gene, comprising:

administering to a subject in need thereof a therapeutic agent which is effective for
regulating expression of a differentially regulated human cancer gene of claim 1.

11. A method of claim 10, wherein said agent is an antibody or an antisense which is
effective to inhibit translation of said gene.

12. A method of diagnosing a cancer associated with abnormal expression of a differentially
regulated human cancer gene, or determining a subject's susceptibility to such cancer,
comprising:

assessing the expression of a differentially regulated human cancer gene of claim 1 in
a tissue sample comprising prostate or breast cells.
13. A method of claim 12, wherein assessing is:

measuring expression levels of said gene, determining the genomic structure of said
gene, determining the mRNA structure of transcripts from said gene, or measuring the
expression levels of polypeptide coded for by said gene.

14. A method of claim 12, wherein said assessing is performed by:

Northern blot analysis, polymerase chain reaction (PCR), reverse transcriptase PCR,
RACE PCR, or in situ hybridization, and

using a polynucleotide probe having a polynucleotide sequence selected from SEQ ID
NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62,
64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and 99, or a complement thereto.

15. A method of assessing a therapeutic or preventative intervention in a subject having
cancer, comprising,

determining the expression levels of differentially regulated cancer genes of claim 1
in a tissue sample comprising prostate or breast cells.

16. A method for identifying an agent that modulates the expression of a differentially
regulated cancer gene expressed in breast or prostate tissue cells, comprising,

contacting a cell population comprising breast or prostate cells with a test agent under
conditions effective for said test agent to modulate the expression of a differentially regulated
cancer gene of claim 1 in said breast or prostate cells, and

determining whether said test agent modulates said differentially regulated cancer
gene.

17. A method for identifying an agent that modulates the expression of a polypeptide coded
for by a differentially regulated cancer gene expressed in breast or prostate tissue cells,
comprising,

contacting a polypeptide coded for by a differentially regulated cancer gene of claim
1, with a test agent under conditions effective for said test agent to modulate said
polypeptide, and

determining whether said test agent modulates said polypeptide.

18. A method of claim 16 or 17, wherein said test agent is an antibody. 

19. A method of detecting polymorphisms in differentially regulated cancer genes
comprising:

comparing the structure of: genomic DNA comprising all or part of differentially
regulated cancer genes, mRNA comprising all or part of differentially regulated cancer genes,
cDNA comprising all or part of differentially regulated cancer genes, or a polypeptide
comprising all or part of differentially regulated cancer genes, with the structure of a
differentially regulated cancer gene as set forth in a polynucleotide sequence selected from
SEQ ID NOS 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58,
60, 62, 64, 66, 68, 70, 72, 74, 84, 85, 86, 88, 95, and 99.

20. A method of identifying a genetic basis for a prostate or breast cancer, or susceptibility
thereto, comprising:

determining the association of cancer, or susceptibility thereto, with a polynucleotide of
claim 1.

21. A method of claim 20, wherein determining is performed by producing a human-linkage
map using said polynucleotide.

22. A method of claim 20, wherein determining is performed by comparing the nucleotide 
sequences of said polynucleotide between normal subjects and subjects having cancer. 
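
The comparison recited in claim 22 can be pictured with a short Python sketch. It is an illustration only and not part of the claims: the function name `cancer_specific_variants` is hypothetical, and the sketch assumes that all sequences have already been aligned to the same coordinates and have equal length.

```python
def cancer_specific_variants(normal_seqs, cancer_seqs):
    """Report positions where a cancer subject carries a base that is not
    seen at that position in any normal subject.

    Returns a dict mapping 1-based position -> sorted list of such bases.
    All sequences are assumed to be pre-aligned and of equal length.
    """
    normal = [s.upper() for s in normal_seqs]
    cancer = [s.upper() for s in cancer_seqs]
    if not normal or not cancer:
        raise ValueError("both groups need at least one sequence")
    length = len(normal[0])
    if any(len(s) != length for s in normal + cancer):
        raise ValueError("all sequences must have the same length")

    variants = {}
    for index in range(length):
        bases_in_normal = {s[index] for s in normal}
        novel = {s[index] for s in cancer} - bases_in_normal
        if novel:
            variants[index + 1] = sorted(novel)
    return variants
```

A position reported here is only a candidate; establishing an association with cancer would require the kind of population or linkage analysis referred to in claims 20 and 21.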

23. A non-human, transgenic mammal, or a cell thereof, whose genome comprises a
functional disruption of a homolog of a differentially regulated cancer gene of claim 1.

24. An antibody which is specific for a differentially regulated cancer gene polypeptide of
claim 5, 6, or 7.
25. A method of advertising differentially regulated cancer genes for sale, commercial use,
or licensing, comprising,

displaying in a computer-readable medium a polynucleotide set forth in SEQ ID NOS
1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 45, 47, 50, 48, 52, 54, 56, 58, 60, 62, 64,
66, 68, 70, 72, 74, 84, 85, 86, 88, 95, or 99 of claim 1, effective specific fragments thereof, or
complements thereto.

26. A method of selecting a polynucleotide sequence coding for differentially regulated
cancer genes from a database comprising polynucleotide sequences, comprising

displaying, in a computer-readable medium, a polynucleotide sequence or polypeptide
sequence for one or more differentially regulated cancer genes of claim 1, or complements to
the polynucleotide sequence,

wherein said displayed sequences have been retrieved from said database upon
selection by a user.
* 20 * 40 *
PCP0749 : MSKIRRKVTVENTKTISDSTSRRPSVFERLGPSTGSTAETQCRNWLKTGNCLYGMTCR : 58
KIAA0803 : :

60 * 80 * 100 *
PCP0749 : FVHGPSPRGKGYSSNyRRSPERPTGDLRERMKNKRQDVDTEPQKRNTEESSSPVRKES : 116
KIAA0803 : :

120 * 140 * 160 *
PCP0749 : SRGRHREKEDIKITKERTPESEGENVEWETNRDDSDNGDINYDyVHELSLEMRRQKIQ : 174
KIAA0803 : :

180 * 200 * 220 *
PCP0749 : RELMKLEQENMEKREEIIIKKEVSPEVVRSKLSPSPSLRKSSRSPKRKSSPKSSSASK : 232
KIAA0803 : :



240 * 260 * 280 * 
PCP0749 : KDRKTSAVSSPLLDQQRNSKTNQSKBUCGPRTPSPPPPIPEDIALGKKYKEKYKVKDRI : 290 
KIAA0803 : : 

300 * 320 * 340 
PCP0749 : SBKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSPISSRHHSSSSQSGSSIQRHSPS : 348 
KIAA0803 : : 



PCP0749 
KIAA0803 



PCP0749 
KIAA0803 



PCP0749 
KIAA0803 



PCP0749 
KIAA0803 



PCP0749 
KIAA0803



PCP0749 
KIAA0803



PCP0749 
KIAA0803 



* 360 * 380 * 

PRRKRTPSPSYQRTLTPPLRRSASPYPSHSLSSPQRKQSPPRHRSP 



400 



MREKGRHDHERT 
MREKGRHDHERT 



MREKGRHDHBRT 



420 



440 



460 



SQSHDRRHERREDTRGKRDREBCDSREEREYEQDQSSSRDHRDDREPRDGRDRRDARDT 
SQSHDRRHERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDRRDARDT 



SQSHDRRHERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDRRDARDT 

* 480 * 500 * 520 



RDRRELRDSRDM] 


RDSREMR 


:DYSRI 


DTKESF 


ldprdsrstrda: 


HDYRDREGRDT 


HRKEDTY 


RDRRELRDSRDM] 


RDSREMF 


IDYSRJ 


3TKESF 


ldprdsrstrda: 


HDYRDREGRDT 


H RKEDTY 



RDRRELRDSRDMRDSREMRDYSRDTRESRDPRDSRSTRDAHDYRDREGRDTHRKEDTY 



540 



560 



580 



PEESRSYGRNHLREESSRTEIRNESRNESRSEIRMDRMGRSRGRVPELPEKGSRGSRG 
PEESRSYGRMHLREESSRTEIRNESRNESRSEIRNDRMGRSRGRVPELPEKGSRGSRG 



PEESRSYGRNHLREESSRTEIRNESRNBSRSEIRNDRMGRSRGRVPELPEKGSRGSRG 

* 620 * 6 



600 



SQTDSHSSNSNYHDSWETRSSYPERDRYPERDNRDQARDSSFERRHGERDRRDMRERD 
S0IDSHSSW3NYHDSWETRSSYPERDRYPERDMRDQARDSSFERRHGERDRRDNRERD 



SQIDSHSSNSNYHDSWETRSSYPERDRYPERDNRDQARDSSFERRHGERDRRDNRERD 



40 



660 



680 



QRPSSPIRHQGRWDELERDERREERRVDRVDDRRDERARERDRERERDRERERERERE 
QRPSSPIRHQGRNDELERDERREERRVDRVDDRRDERARERDRERERDRERERERERE 



QRP S S PI RHQGRNDELERDERREERRVDRV DDRRDERARERDRERERDRERERERERE 



700 



720 



740 



RDREP.EKERELERERAREREREREKERDRERDRDRDHDRERERERERDREKERERERE 
RDREREKERELERERAREREREREKERDRERDRDRDHDRERERERERDREKERERERE 



RDREREKERELERERAREREREREKERDRERDRDRDHDRERERBRERDRERERERERE 



406 
12 



464 
70 



522 
128 



580 
186 



638 
244 



696 
302 



754 
360 



FIG. 1A




FIG. 1B
* 20 * 40 * 6
PCP0389 : MDLHKQWENTETNWHKEKMELLOQFDNERKEWESQWKIMQKKIEELCREVKLWRKXNIN : 59 
KIAA0408 : : 



PCP0389 
KIAA0408 



0 * 80 * ICQ * 1 

ESAKIIDLYHEKTIPEECVIESSPNYPDLGQSEFIRTNHKDGLRKENKREQSLVSGGNC 



118 
1 



20 



140 



160 



PCP0389 
KIAA0408



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



PCP0389 
KIAA0408



PCP0389 
KIAA0408 



PCP0389 
KIAA0408 



CKEQKATKKSKVGFLDPLATDNQKECEAWPDLRTSEEDSKSCSGALSTALEELAKVSEE 
CKEQKATKKSKVGFLDPLATDNQKECEAWPDLRTSEEDSKSCSGALSTALEELAKVSEE 



CKEQKATKKSKVGFLDPLATDNQKECEAWPDLRTSEEDSKSCSGALSTALEELAKVSEE 
180 * 200 * 220 * 



LCSFQEEIRKRSNHRRMKSDSFLQEMPNVTMIPHGDPMINNDQCILPISLEKEKQKNRK 
LCSFQEEIRKRSNHRRMKSDSFLQEMPNVTNIPHGDPMINNDQCILPISLEKEKQKNRK 



LCSFQEEIRKRSNHRRtfiKSDSFLQEMPNVTNIPHGDPMINNDQCILPISLEKEKQKNRK 
240 * 260 * ' 280 * 



MLSCTNVLQSNSTKKCGIDTIDLKRNETPPVPPPRSTSRNFPSSDSEQAYERWKERLDH 
WLSCTNVLQSNSTPCKCGIDTIDLKRNETPPVPPPRSTSRNFPSSDSEQAYERWKERLDH 



NLSCTNVLQSNSTKKCGIDTIDLKRNETPPVPPPRSTSRNFPSSDSEQAYERWKERLDH 
300 * 320 * 340 * 



MSWVPHEGRSKRMYNPHFPLRQQEMSMLYPNEGKTSKDGIIFSSLVPEVKIDSKPPSNE 
MSWVPHEGRSKRNYNPHFPLRQQEMSMLYPNEGKTSKDGIIFSSLVPEVKIDSKPPSNE 



NSWVPHEGRSKRNYNPHFPLRQQEMSMLYPNEGKTSKDGIIFSSLVPEVKIDSKPPSNE 
360 * 380 * 400 * 



DVGLSMWSCDIGIGAKRSPSTSWFQKTCSTPSNPPCYEMVIPDHPAKSHPDLHVSNDCSS 
DVGLSMWSCDIGIGAKRSPSTSWFQKTCSTPSNPKYEMVIPDHPAKSHPDLHVSNDCSS 



DVGLSMWSCDIGIGAKRSPSTSWFQKTCSTPSNPKYEMVIPDHPAKSHPDLHVSNDCSS 
420 * 440 * 460 * 



SVAESSSPLRNFSCGFERTTRNEKLAAPvTDEFNRTVFRTDRNCQAIQQNHSCSKSSEDL 
SVAESSSPLRNFSCGFERTTRMEKLAAKTDEFNRTVFRTDRMCQAIQQNHSCSBCSSEDL 



SVAESSSPLRNFSCGFERTTRNEKLAAKTDEFNRTVFRTDRNCQAIQQNHSCSKSSEDL 
480 * 500 * 520 * 



KPCDTSSTHTGSISQSNDVSGIWKTNAHMPVPMEMVPDNPTKKSTTGLVRQMQGHLSPR 
KPCDTSSTHTGSI3QSNDVSGIWKTNAHMPVPMENVPDNPTKKSTTGLVRQMQGHLSPR 



KPCDTSSTHTGSISQSNDVSGIWKTNAHMPVPMENVPDNPTKKSTTGLVRQMQGHLSPR 

540 * 560 * 580 * 



SYRNMLHEHDWRPSNLSGRPRSADPRSNYGVVEKLLKTYETATESALQNSKCFQDNWTK 
SYRNMLHEHDWRPSNLSGRPRSADPRSMYGVVEKLLKTYETATESALQNSPCCFQDNWTK 



SYRNMLHEHDWRPSNLSGRPRSADPRSNYGWEKLLKTYETATESALQNSKCFQDNWTK 
600 * 620 * 640 



CNSDVSGGATLSQHLEMLQMEQQFQQKTAVWGGQEVKQGIDPKKITEESMSVNASHGKG 
CNSDVSGGATLSQHLEMLQMEQQFQQKTAVWGGQEVKQGIDPKKITEESMSVNASHGKG 



CNSDVSGGATLSQHLEMLQMEQQFQQKTAVWGGQEVKQGI DPKKI TEESMS VNASHGKG 
* 660 * 680 * 700 



FSRPARPANRRLPSRWASRSPSAPPALRRTTHNYTISLRSEALMV 
FSRPARPANRRLPSRWASRSPSAPPALRRTTHNYTISLRSEALMV 



177 
60 



236 
119 



295 
178 



354 
237 



413 
296 



472 
355 



531 
414 



590 
473 



649 
532 



694 
577 



FSRPARPANRRLPSRWASRSPSAPPALRRTTHNYTXSLRSEALMV 



FIG. 2 
PCP0814
NM 030817 



PCP0814 
NM 030817



PCP0814 
NM 030817 



PCP0814 
NM 030817 



PCP0814 
NM 030817 



* 20 * 

MFRAPCHRLRARGTRKARAGAWRGCT FPCLGKG 



60 



80 



40 



MERPAAREPHGPDALRRFQGLLLDR 
MERPAAREPHGPDALRRFQGLLLDR 



MCRPAAREPHGPOALRRFQGLLLDR 
* 100 * 



RGRLHRQVLRLREVARRLERLRRRSLVANVAGSSLSATGALAAIVGLSLSPVTLGTSL 
RGRLHRQVLRLREVARRLERLRRRSLVANVAGSSLSATGALAAIVGLSLSPVTLGTSL 



RGRLHRQVLRLREVARRLERLRRRSLVANVAGSSLSATGALAAIVGLSLSPVTLGTSL 
120 * 140 * 160 * 



LVSAVGLGVATAGGAVTITSDLSLIFCNSRELRRVQEIAATCQDQMREILSCLEFFCR 
LVSAVGLGVATAGGAVTITSDLSLIFCNSRELRRVQEIAATCQDQMREILSCLEFFCR 



LVSAVGLGVATAGGAVTITSDLSLIFCNSRELRRVQEIAATCQDQMREILSCLEFFCR 

180 * 200 * 220 * 



WQGCGDRQLLQCGRNASIALYNSVYFIVFFGSRGFLIPRRAEGDTKVSQAVLKAKIQK 
WQGCGDRQLLQCGRNASIALYNSVYFIVFFGSRGFLIPRRAEGDTKVSQAVLKAKIQK 



WQGCGDRQLLQCGRNASIALYNSVYFIVFFGSRGFLIPRRAEGDTKVSQAVLKAKIQK 

240 * 260 * 



LAESLESCTGALDELSEQLESRVQLCTKSSRGHDLKISADQRAGLFF 
LAESLESCTGALDELSEQLESRVQLCTKSSRGHDLKISADQRAGLFF 



LAESLESCTGALDELSEQLESRVQLCTKSSRGHDLKISADQRAGLFF 



58 
25 



116 
83 



174 
141 



232 
199 



FIG. 3 
PCP0623
NM_015384 
NM 133433 



* 20 * 40 * 

MNGDMPHVPITTIiAGIASLTDLLNQLPLPSPLPATTTKSLLFNARIAEEVNCLLACR 



57 



PCP0623 
NM_015384 
NM 133433 



60 * 80 ♦ 100 * 

DDNLVSQLVHSLNQVSTDHIBLKDNLGSDDPEGDIPVLLQAVLARSPNVFREKSMQN 



114 



PCP0623 
NM_015384 
NM 133433 



120 * 140 * 160 * 

RYVQSGMMMSQYKLSQNSMHSSPASSNYQQTTISHSPSSRFVPPQTSSGNRFMPQQN 



171 



PCP0623 
NM_015384 
NM 133433 



180 * 200 * 220 

SPVPSPYAPQSPAGYMPYSHPSSYTTHPQMQQASVSSPIVAGGLRNIHDNKVSGPLS 



228 



PCP0623 
NM_015384 
NM 133433 



* 240 * 260 * 280 

GNSANHHADNPRHGSSEDYLHMVHRLSSDDGDSSTMRNAASFPLRSPQPVCSPAGSE 



285 



PCP0623 
NM_015384 
NM 133433 



* 300 * 320 * 340 

GTPKGSRPPLILQSQSLPCSSPRDVPPDILLDSPERKQKKQKKMKLGKDEKEQSEKA 



342 



PCP0623 
NM_015384 
NM 133433 



* 360 * 380 * 40 

AMYDIISSPSKDSTKLTLRLSRVRSSDMDQQEDMISGVENSNVSENDIPFNVQYPGQ 



399 



PCP0623 
NM_015384 
NM 133433 



0 * 420 * 440 * 

TSKTPITPQDINRPLNAAQCLSQQEQTAFLPANQVPVLQQNTSVAAKQPQTSVVQNQ 



456 



PCP0623 
NM_015384 
NM 133433 



460 * 480 * 500 * 

QQISQQGPIYDEVELDALAEIERIERESAIERERFSKEVQDKDKPLKKRKQDSYPQE 



513 



PCP0623 
NM_015384 
NM 133433 



520 * 540 

AGGATGGNRPASQETGSTGNGSRPAL 



560 



MVSIDLHQAGRVDSQASITQDSDSIKKPEEI 
MVSIDLHQAGRVDSQASITQDSDSIKKPEEI 
t-IVSIDLHQAGRVDSQASITQDSDSIKKPEEI 



570 
31 
31 



PCP0623 
NM_015384 
NM 133433 



580 



600 



620 



KQCMDAPVSVLQEDIVGSLKSTPENHPETPKKKSDPELSKSEMKQSESRLAESKPNE 
KQCNDAPVSVLQEDIVGSLKSTPEMHPETPKKKSDPELSKSEMKQSESRLAESKPNE 
KQCNDAPVSVLQEDIVGSLKSTPENHPETPECKKSDPELSKSEMKQSESRLAESKPNE 



627 
88 
88 



FIG. 4A 
PCP0623
NM_015384 
NM 133433 



660 



680 



NRLVETKSSENKLETKVETQTEELKQWESRTTECKQNESTIVEPKQMENRLSDTRPN 
* 1340 * 1360 * 1380

PC0177A : SGATVPPKEKKNLEFFHEDVRKSDVEYENGPQMEFQ : 1203

PC0177B : SGATVPPKEKKNLEFFHEDVRKSDVEYENGPQMEFQ : 1168 

PC0177C : SGATVPPKEKKNLEFFHEDVRKSDVEYENGPQMEFQ : 1168 

PC0177D : SLTSYKAQNGSSSKATPSTAKETS : 1309 

KIAA1217 : SLTSYKAQNGSSSKATPSTAKETS : 1259 

* 1400 * 1420 * 1440 
PC0177A : KVTTGAVRPSDPPKWERGMENSISDASRTSEYKTEIIMKENSISNMSLLRDSRNYSQETV : 1263 
PC0177B : KVTTGAVRPSDPPKWERGMENSISDASRTSEYKTEIIMKENSISNMSLLRDSRNYSQETV : 1228 
PC0177C : KVTTGAVRPSDPPKWERGMENSISDASRTSEYKTEIIMKENSISNMSLLRDSRNYSQETV : 1228

PC0177D : : 

KIAA1217 : :

* 1460 * 1480 * 1500 
PC0177A : PKASFGFSGISPLEDEINKGSKISGLQYSIPDTENQTLNYGKTKEMEKQNTDKCHVSSHT : 1323 
PC0177B : PKASFGFSGISPLEDEINKGSKISGLQYSIPDTENQTLNYGKTKEMEKQNTDKCHVSSHT : 1288 
PC0177C : PKASFGFSGISPLEDEINKGSKISGLQYSIPDTENQTLNYGKTKEMEKQNTDKCHVSSHT : 1288 

PC0177D : :

KIAA1217 : :

* 1520 * 1540 * 1560 
PC0177A : RLTESSVHDFKTEDQEVITTDFGQWLRPKEARHANVNPNEDGESSSSSPTEENAATDNI : 1383 
PC0177B : RLTESSVHDFKTEDQEVITTDFGQWLRPKEARHANVNPNEDGESSSSSPTEENAATDNI : 1348 
PC0177C : RLTESSVHDFKTEDQEVITTDFGQWLRPKEARHANVNPNEDGESSSSSPTEENAATDNI : 1348 

PC0177D : :

KIAA1217 : :

* 1580 * 1600 * 1620 
PC0177A : AFMITETTVQVLSSGEVHDIVSQKGEDIQTVNIDARKEMTPRQEGTDNEDPVVCLDKKPV : 1443
PC0177B : AFMITETTVQVLSSGEVHDIVSQKGEDIQTVNIDARKEMTPRQEGTDNEDPVVCLDKKPV : 1408
PC0177C : AFMITETTVQVLSSGEVHDIVSQKGEDIQTVNIDARKEMTPRQEGTDNEDPVVCLDKKPV : 1408

PC0177D : :

KIAA1217 : :

* 1640 * 1660 * 1680 
PC0177A : IIIFDEPMDIRSAYKRLSTIFEECDEELERMMMEEKIEEEEEEENGDSWQNNNTSQMSH : 1503
PC0177B : IIIFDEPMDIRSAYKRLSTIFEECDEELERMMMEEKIEEEEEEENGDSWQNNNTSQMSH : 1468
PC0177C : IIIFDEPMDIRSAYKRLSTIFEECDEELERMMMEEKIEEEEEEENGDSWQNNNTSQMSH : 1468

PC0177D : :

KIAA1217 : :

* 1700 * 1720 * 1740 
PC0177A : KKVAPGNLRTGQQVETKSQPHSLATETRNPGGQEMNRTELNKFSHVDSPNSECKGEDATD : 1563 
PC0177B : KKVAPGNLRTGQQVETKSQPHSLATETRNPGGQEMNRTELNKFSHVDSPNSECKGEDATD : 1528 
PC0177C : KKVAPGNLRTGQQVETKSQPHSLATETRNPGGQEMNRTELNKFSHVDSPNSECKGEDATD : 1528 

PC0177D : :

KIAA1217 : :
* 260 * 280 * 300
* 320 * 340 * 360

PCP0454 : PSRHPLLLLHQSFQPLESIMKCVQMSWMVILVGPASVGKTSLVQLLAHLTGHTLKIMAMN : 360
PCP0454 : VYVCSQHSPANRKLVQALLEKHVSSLRAHETWGDSILGMGIiWPDSVPSALFATEDSHLST : 720
KIAA0301 : :
* 740 * 760 * 780
PCP0454 : VRRDGQILVYCLNRMSMKTSSWTRSQPFTLQDLEKIMQSPSPENLKFNAVEVNTYWIDEP : 780
KIAA0301 : :

* 1280 * 1300 * 1320 
PCP0454 : LLHHQKVSPEEITSLWSELFNSMFMSFWSSTVTTNPEYWLMWNPLPGMQQREAPKSVLDS : 1320 
KIAA0301 : : 

* 1340 * 1360 * 1380 
PCP0454 : TLKGPGNLNRPIFSKCCFEVLTSSWRASPWDVSGLPILSSSHVTLGEWVERTQQLQDISS : 1380 
KIAA0301 : : 

* 1400 * 1420 * 1440 
PCP0454 : MLWTNMAISSVAEFRRTDSQLQGQVLFRHLAGLAELLPESRRQEYMQNCEQLLLGSSQAF : 1440 
KIAA0301 : : 



FIG. 14B 






* 1460 * 1480 * 1500 
PCP0454 : QHVGQTLGDMAGQEVLPKELLCQLLTSLHHFVGEGESKRSLPEPAQRGSLWVSLGLLQIQ : 1500 
KIAA0301 : : 

* 1520 * 1540 * 1560 
PCP0454 : TWLPQARFDPAVKREYKLNYVKEELHQLQCEWKTRNLSSQLQTGRDLEDKVWSYSHPHV : 1560 
KIAA0301 : : 

* 1580 * 1600 * 1620 
PCP0454 : RLLRQRMDRLDNLTCKLLKKQAFRPQLPAYBSLVQEIHHYyTSIAKAPAVQDLLTRLLQA : 1620 
KIAA0301 : : 


* 1640 * 1660 * 1680 

PCP0454 : LHIDGPRSAQVAQSLLKEEASWQQSHHQFRKRLSEEYTFYPDAVSPLQASILQLQHGMRL : 1680 
KIAA0301 : : 

* 1700 * 1720 * 1740 
PCP0454 : VASELHTSLYSSMVGADRLGTLATALLAFPSVGPTFPTYYAHADTLCSVKSEEVLRGLGK : 1740 
KIAA0301 : : 

* 1760 * 1780 * 1800 

PCP0454 : LILKRSGGKELEGKGQKACPTREQLLMNALLYLRSHVLCKGELDQRALQLFRHVCQEIIS : 1800 
KIAA0301 : : 

* 1820 * 1840 * 1860 

PCP0454 : EWDEQERIAQEKAEQESGLYRYRSRNSRTALSEEEEBEREFRKQFPLHEKDFADILVQPT : 1860 
KIAA0301 : : 

* 1880 * 1900 * 1920 

PCP0454 : LEENKGTSDGQEEEAGTNPALLSQNSMQAV^^^^^^^QgQ^g^j^Q^gSgg : 1920 
KIAA0301 : mKBBSKKBS^^KBBBBS^^S^ : 30 



* 1940 * 1960 * 1980 
PCP0454 : SLFLSCYQTGASLVTHFYPLMGVELNDRLLGSQLLACTLSHNTLFGEAPSDLMVKPDGPY : 1980 
KIAA0301 : SLFLSCYQTGASLVTHFYPLMGVELNDRLLGSQLLACTLSHNTLFGEAPSDLMVKPDGPY : 90 



* 2000 * 2020 * 2040 
PCP0454 : DFYQHPNVPEARQCQPVLQGFSEAVSHLLQDWPEHPALEQLLVVMDRIRSFPLSSPISKF : 2040 
KIAA0301 : DFYQHPWVPEARQCQPVLQGFSEAVSHLLQDWPEHPALEQLLVVMDRIRSFPLSSPISKF : 150 



* 2060 * 2080 * 2100 
PCP0454 : LNGLEILLAKAQDWEENASRALSLRKHLDLISQMIIRWRKLELNCWSMSLDWTMKRHTEK : 2100 
KIAA0301 : LNGLEILLAKAQDWEENASRALSLRKHLDLISQMIIRWRKLELNCWSMSLDNTMKRHTEK : 210 



* 2120 * 2140 * 2160 
PCP0454 : STKHWFSIYQMLEKHMQEQTEEQEDDKQMTLMLLVSTLQAFIEGSSLGEFHVRLQMLLVF : 2160 
KIAA0301 : STKHWFSIYQMLEKHMQEQTEEQEDDKQMTLMLLVSTLQAFIEGSSLGEFHVRLQMLLVF : 270 



FIG. 14C 






* 2180 * 2200 * 2220 
PCP0454 : HCHVLLMPQVEGKDSLCSVLWNLYHYYKQFFDRVQAKIVELRSPLEKELKEFVKISKWND : 2220 
KIAA0301 : HCHVLLMPQVEGKDSLCSVLWNLYHYYKQFFDRVQAKIVELRSPLEKELKEFVKISKWND : 330 



* 2240 * 2260 * 2280 
PCP0454 : VSFWSIKQSVEKTHRTLFKFMKKFEAVLSEPCRSSLVESDKEEQPDFLPRPTDGAASELS : 2280 
KIAA0301 : VSFWSIKQSVEKTHRTLFKFMKKFEAVLSEPCRSSLVESDKEEQPDFLPRPTDGAASELS : 390 



* 2300 * 2320 * 2340 
PCP0454 : SIQNLNRALRETLLAQPAAGQATIPEWCQGAAPSGLEGELLRRLPKLRKRMRKMCLTFMK : 2340 
KIAA0301 : SIQNLNRALRETLLAQPAAGQATIPEWCQGAAPSGLEGELLRRLPKLRKRMRKMCLTFMK : 450 



* 2360 * 2380 * 2400 
PCP0454 : ESPLPRLVEGLDQFTGEVISSVSELQSLKVEPSAEKEKQRSEAKHILMQKQRALSDLFKH : 2400 
KIAA0301 : ESPLPRLVEGLDQFTGEVISSVSELQSLKVEPSAEKEKQRSEAKHILMQKQRALSDLFKH : 510 



* 2420 * 2440 * 2460 
PCP0454 : LAKIGLSYRKGLAWARSKNPQEMLHLHPLDLQSALSIVSSTQEADSRLLTEISSSWDGCQ : 2460 
KIAA0301 : LAKIGLSYRKGLAWARSKMPQEMLHLHPLDLQSALSIVSSTQEADSRLLTEISSSWDGCQ : 570 



* 2480 * 2500 * 2520 
PCP0454 : KYFYRSLARHARLNAALATPAKEMGMGNVERCRGFSAHLMKMLVRQRRSLTTLSEQWIIL : 2520 
KIAA0301 : KYFYRSLARHARLWAALATPAKEMGMGNVERCRGFSAHLMKMLVRQRRSLTTLSEQWIIL : 630 



* 2540 * 2560 * 2580 
PCP0454 : RNLLSCVQEIHSRLMGPQAYPVAFPPQDGVQQWTERLQHLAMQCQILLEQLSWLLQCCPS : 2580 
KIAA0301 : RWLLSCVQEIHSRLMGPQAYPVAFPPQDGVQQWTERLQHLAMQCQILLEQLSWLLQCCPS : 690 



* 2600 * 2620 * 2640 
PCP0454 : VGPAPGHGMVQVLGQPPGPCLEGPELSKGQLCGVVLDLIPSNLSYPSPIPGSQLPSGCRM : 2640 
KIAA0301 : VGPAPGHGMVQVLGQPPGPCLEGPELSKGQLCGVVLDLIPSNLSYPSPIPGSQLPSGCRM : 750 



* 2660 * 2680 * 2700 
PCP0454 : RKQDHLWQQSTTRLTEMLKTIKTVKADVDKIRQQSCETLFHSWKDFEVCSSALSCLSQVS : 2700 
KIAA0301 : RKQDHLWQQSTTRLTEMLKTIKTVKADVDKIRQQSCETLFHSWKDFEVCSSALSCLSQVS : 810 



* 2720 * 2740 * 2760 
PCP0454 : VHLQGLESLFILPGMEVEQRDSQMALVESLEYVRGEISKAMADFTTWKTHLLTSDSQGGN : 2760 
KIAA0301 : VHLQGLESLFILPGMEVEQRDSQMALVESLEYVRGEISKAMADFTTWKTHLLTSDSQGGN : 870 



* 2780 * 2800 * 2820 
PCP0454 : QMLDEGFVEDFSEQMEIAIRAILCAIQNLEERKNEKAEENTDQASPQEDYAGFERLQSGH : 2820 
KIAA0301 : QMLDEGFVEDFSEQMEIAIRAILCAIQNLEERKNEKAEENTDQASPQEDYAGFERLQSGH : 930 



* 2840 * 2860 * 2880 
PCP0454 : LTKLLEDDFWADVSTLHVQKIISAISELLERLKSYGEDGTAAKHLFFSQSCSLLVRLVPV : 2880 
KIAA0301 : LTKLLEDDFWADVSTLHVQKIISAISELLERLKSYGEDGTAAKHLFFSQSCSLLVRLVPV : 990 



FIG. 14D 






* 2900 * 2920 * 2940 
PCP0454 : LSSYSDLVLFFLTMSLATHRSTAKLLSVLAQVFTELAQKGFCLPKEFMEDSAGEGATEFH : 2940 
KIAA0301 : LSSYSDLVLFFLTMSLATHRSTAKLLSVLAQVFTELAQKGFCLPKEFMEDSAGEGATEFH : 1050 



* 2960 * 2980 * 3000 
PCP0454 : DYEGGGIGEGEGMKDVSDQIGNEEQVEDTFQKGQEKDKEDPDSKSDIKGEDNAIEMSEDF : 3000 
KIAA0301 : DYEGGGIGEGEGMKDVSDQIGWEEQVEDTFQKGQEKDKEDPDSKSDIKGEDNAIEMSEDF : 1110 



* 3020 * 3040 * 3060 
PCP0454 : DGKMHDGELEEQEEDDEKSDSEGGDLDKHMGDLNGEEADKLDERLWGDDDEEEDEEEEDN : 3060 
KIAA0301 : DGKMHDGELEEQEEDDEKSDSEGGDLDKHMGDLNGEEADKLDERLWGDDDEEEDESEEDN : 1170 



* 3080 * 3100 * 3120 
PCP0454 : KTEETGPGMDEEDSELVAKDDNLDSGNSNKDKSQQDKKEEKEEAEADDGGQGEDKINEQI : 3120 
KIAA0301 : KTEETGPGMDEEDSELVAKDDNLDSGNSNKDKSQQDKKEEKEEAEADDGGQGEDKINEQI : 1230 



* 3140 * 3160 * 3180 
PCP0454 : DERDYDENEVDPYHGNQEKVPEPEALDLPDDLNLDSEDKNGGEDTDNEEGEEENPLEIKE : 3180 
KIAA0301 : DERDYDEMEVDPYHGNQEKVPEPEALDLPDDLNLDSEDKNGGEDTDNEEGEEENPLEIKE : 1290 



* 3200 * 3220 * 3240 
PCP0454 : KPEEAGHEAEERGETETDQNESQSPQEPEEGPSEDDKAEGEEEMDTGADDQDGDAAQHPE : 3240 
KIAA0301 : KPEEAGKEAEERGETETDQNESQSPQEPEEGPSEDDKAEGEEEMDTGADDQDGDAAQHPE : 1350 



* 3260 * 3280 * 3300 
PCP0454 : EHSEEQQQSVEEKDKEADEEGGENGPADQGFQPQEEEEREDSDTEEQVPEALERKEHASC : 3300 
KIAA0301 : EHSEEQQQSVEEKDKEADEEGGENGPADQGFQPQEEEEREDSDTEEQVPEALERKEHASC : 1410 



* 3320 * 3340 * 3360 
PCP0454 : GQTGVENMQMTQAMELAGAAPEKEQGKEEHGSGAADANQAEGHESNFIAQLASQKHTRKN : 3360 
KIAA0301 : GQTGVENMQNTQAMELAGAAPEKEQGKEEHGSGAADAMQAEGHESNFIAQLASQKHTRKN : 1470 



* 3380 * 3400 * 3420 
PCP0454 : TQSFKRKPGQADNERSMGDHNERVHKRLRTVDTDSHAEQGPAQQPQAQVEDADAFEHIKQ : 3420 
KIAA0301 : TQSFKRKPGQADNERSMGDHNERVHKRLRTVDTDSHAEQGPAQQPQAQVEDADAFEHIKQ : 1530 



* 3440 * 3460 * 3480 
PCP0454 : GSDAYDAQTYDVASKEQQQSAKDSGKDQEEEEIEDTLMDTEEQEEFKAADVEQLKPEEIK : 3480 
KIAA0301 : GSDAYDAQTYDVASKEQQQSAKDSGKDQEEESIEDTLMDTEEQEEFKAADVEQLKPEEIK : 1590 



* 3500 * 3520 * 3540 
PCP0454 : SGTTAPLGFDEMEVEIQTVKTEEDQDPRTDKAHKETEMEKPERSRESTIHTAHQFLMDTI : 3540 
KIAA0301 : SGTTAPLGFDEMEVEIQTVKTEEDQDPRTDKAHKETENEKPERSRESTIHTAHQFLMDTI : 1650 



* 3560 * 3580 * 3600 
PCP0454 : FQPFLKDVWELRQELERQLEMWQPRESGNPEEEKVAAEMWQSYLILTAPLSQRLCEELRL : 3600 
KIAA0301 : FQPFLKDVWELRQELERQLEMWQPRESGNPEEEKVAAEMWQSYLILTAPLSQRLCEELRL : 1710 



FIG. 14E 






* 3620 * 3640 * 3660 
PCP0454 : ILEPTQAAKLKGDYRTGKRLNIRKVIPYIASQFRKDKIWLRRTKPSKRQYQICLAIDDSS : 3660 
KIAA0301 : ILEPTQAAKLKGDYRTGKRLNIRKVIPYIASQFRKDKIWLRRTKPSKRQYQICLAIDDSS : 1770 



* 3680 * 3700 * 3720 
PCP0454 : SMVDNHTKQLAFESLAVIGNALTLLEVGQIAVCSFGESVKLLHPFHEQFSDYSGSQILRL : 3720 
KIAA0301 : SMVDMHTKQLAFESLAVIGMALTLLEVGQIAVCSFGESVKLLHPFHEQFSDYSGSQILRL : 1830 



* 3740 * 3760 * 3780 
PCP0454 : CKFQQKKTKIAQFLESVANMFAAAQQLSQNISSETAQLLLVVSDGRGLFLEGKERVLAAV : 3780 
KIAA0301 : CKFQQKKTKIAQFLESVANMFAAAQQLSQMISSETAQLLLVVSDGRGLFLEGKERVLAAV : 1890 



* 3800 * 3820 * 3840 
PCP0454 : QAARNANIFVIFVVLDNPSSRDSILDIKVPIFKGPGEMPEIRSYMEEFPFPYYIILRDVN : 3840 
KIAA0301 : QAARWANIFVIFVVLDMPSSRDSILDIKVPIFKGPGEMPEIRSYMEEFPFPYYIILRDVN : 1950 



* 3860 
PCP0454 : ALPETLSDALRQWFELVTASDHP 
KIAA0301 : ALPETLSDALRQWFELVTASDHP 



FIG. 14F 






* 20 * 40 * 60 
PCP0557 : MAAAAVWPAEWIKNWEKSGRGEFLHLCRILSENKSHDSSTYRDFQQALYELSYHVIKGN : 60 
AF441770 : ffiHBSSBBMJclil:tal»t** «iis<WATiMaaMe«iM^ : 483 

* 740 * 760 * 780 

PCP0557 : : 780 

AF441770 : BBiSglMEBgaSinSBriiiiMiiaiiMiiiifflRTOM^^ : 543 